Psychological Bulletin 





Eysenck’s Treatment of the 
Personality of Communists 


The Psychology of Politics and 
the Personality Similarities 
Between Fascists and Communists 


The Quantitative Study of Shape 
AND MALCOLM D. ARNOULT 


The Normal Curve and the Attenuation 
LLOYD G. HUMPHREYS 


The Ability of Human Operators to 
Detect Acceleration of Target 


On the Origin and Early Use of the Term 
Vicarious Trial and Error (VTE) 


This is the last issue of Volume 53. 


Volume contents and title page 
appear herein. 





Published Bimonthly by the 


NOVEMBER, 1956 





Wayne Dennis, Editor 
Brooklyn College 


LORRAINE BOUTHILET, Managing Editor 
Consulting Editors 
LAUNOR F. CARTER, RAND Corporation, Santa Monica, California 
EDWARD GIRDEN, Brooklyn College 
VICTOR C. RAIMY, University of Colorado 
ROBERT L. THORNDIKE, Teachers College, Columbia University 
BENTON J. UNDERWOOD, Northwestern University 
S. RAINS WALLACE, Life Insurance Agency Management Association 





The Psychological Bulletin contains evaluative reviews of research literature and articles on research methodology in psychology. This JOURNAL does not publish reports of original research or original theoretical articles. 

Manuscripts should be sent to Wayne Dennis, Department of Psychology, Brook- 
lyn College, Brooklyn 10, New York. 

Preparation of articles for publication. Authors are strongly advised to follow the general directions given in the “Publication Manual of the American Psychological Association” (Psychological Bulletin, 1952, 49 [No. 4, Part 2], 389-449). Special attention should be given to the section on the preparation of the references (pp. 432-440), since this is a particular source of difficulty in long reviews of research literature. All copy must be double spaced, including the references. All manuscripts should be submitted in duplicate. Original figures are prepared for publication; duplicate figures may be photographic or pencil-drawn copies. Authors are cautioned to retain a copy of the manuscript to guard against loss in the mail. 

Reprints. Fifty free offprints are given to contributors of articles and notes. Au- 
thors of early publication articles receive no gratis offprints. 

Communications—including subscriptions, orders of back issues, and changes of 
address—should be addressed to the American Psychological Association, 1333 Six- 
teenth Street N.W., Washington 6, D. C. Address changes must reach the Subscrip- 
tion Office by the 10th of the month to take effect the following month. Undelivered 
copies resulting from address changes will not be replaced; subscribers should notify 
the post office that they will guarantee second-class forwarding postage. Other claims 
for undelivered copies must be made within four months of publication. 

Annual subscription: $8.00 (Foreign $8.50). Single copies, $1.50. 





PUBLISHED BIMONTHLY BY: 


THE AMERICAN PSYCHOLOGICAL ASSOCIATION, INC. 


Menasha, Wisconsin 
and 1333 Sixteenth Street N.W., Washington 6, D.C. 


Copyright, 1956, by The American Psychological Association, Inc. 





VOL. 53, NO. 6 


NOVEMBER, 1956 


Psychological Bulletin 





EYSENCK’S TREATMENT OF THE PERSONALITY 
OF COMMUNISTS 
RICHARD CHRISTIE 
Columbia University¹ 


A current problem in personality theory is that of the relationship between personality variables and susceptibility to deviant political ideologies. A considerable amount of evidence has been collected on individuals on the right wing of the body politic. A current source of frustration for American psychologists, however, is the paucity of relevant data on members of the extreme left. For obvious reasons, recent years have seen a marked shrinkage in the size of this population and an increase in sampling difficulties. It is therefore of interest to find that attention is directed toward the personality characteristics of communists, among others, by H. J. Eysenck of Maudsley Hospital, London, in his recent book The Psychology of Politics (8). 

One of Eysenck’s major contentions is that communists and fascists are similar in being “tough-minded” and “authoritarian.” This is a highly plausible hypothesis. However, a careful examination of his data indicates that if his measures of these attributes were valid, quite different conclusions would be drawn. 

The present critique shall be restricted to methodological points and their implications for a more adequate understanding of the relationships between certain aspects of personality and political ideology. Eysenck’s interpretation of his data on communists and fascists is crucial for his theoretical schema but a thorough evaluation of the latter would add unduly to the length of this paper.² Detailed documentation in support of the present criticism will be presented and only those inaccuracies and inconsistencies of Eysenck’s which are pertinent to our specific topic shall be cited. 

¹ This article was written while the author was a Fellow at the Center for Advanced Study in the Behavioral Sciences. An earlier draft was substantially modified as a result of suggestions made by colleagues. Ramon J. Rhine assisted the author in statistical calculations. 


ARE COMMUNISTS AND FASCISTS SIMILAR IN BEING “TOUGH-MINDED”? 


References to the finding that communists and fascists differ from less politically deviant samples in being more “tough-minded” are scattered throughout The Psychology of Politics. This is considered demonstrated by scores on a scale designed to measure “tough-tender-mindedness.” Examination of the evidence indicates: (a) that no confident generalizations are justified upon the basis of the sampling procedures used in selecting samples from the parent populations; (b) that the scale does not measure “tough-mindedness,” at least among communists; and (c) that Eysenck engages in misleading manipulations of communist and fascist test scores in contrasting them to various “neutral groups.” 

² A review of The Psychology of Politics by the writer may be found in The American Journal of Psychology, 1955, 68, 702-704. 

Purported evidence for tough-mindedness comes from two studies. The first was conducted by Eysenck; the second was an unpublished doctoral dissertation done at the University of London by Thelma Coulter which is cited by Eysenck. They shall be examined separately. 


THE EYSENCK STUDY 

In The Psychology of Politics Eysenck states, “When we average the average scores of the groups on the T factor, i.e., without paying attention to the fact that the number of cases is different between the groups, we find that the Liberals are the most tender-minded with a score of 7.7; that the Socialists and Conservatives follow next, with a score of 7.0; and the combined Communist-Fascist group has much the most tough-minded score (5.5)”³ (8, pp. 137-138). In evaluating this conclusion the sampling, measuring, and analysis procedures shall be treated in order. 


Sampling 


The largest group of subjects were middle-class adherents of the Conservative, Liberal, and Socialist parties. A smaller sample of working-class members of the same parties was also obtained. In addition, communist subjects were recruited from two branches of the Communist party and a few fascists were obtained in an unspecified manner. 

The basic middle-class sample. This was composed of 250 middle-class members of each of the three major British political parties. The clearest statement of the sampling procedure may be found in an article published in 1947 (5, pp. 53-58). Students in university classes, university extension classes, and in W. E. A. classes were required to give from five to fifteen questionnaires each to friends and acquaintances and have them answered. In this fashion 317 usable questionnaires were collected from individuals identifying themselves as supporters of the Conservative party, 256 from Liberals, and 409 from Socialists. A sample of 250 was then drawn from each of the three parties so that the samples were roughly equated for age, sex, and education. The respondents came from an urban background (5, p. 54). 

³ Socialists refer to members of or voters for the British Labor Party. 

The working-class sample. No specific information is given as to how this sample was selected. It is inferred that they were also given questionnaires by members of Eysenck’s classes since he notes that, “The method of selection adopted has been explained in some detail in the first paper of this series . . .” (6, p. 200). Since the paper referred to was devoted exclusively to middle-class respondents it cannot be determined whether the working-class questionnaires were obtained at the same time as were those of the middle-class respondents or at a later date. The number of protocols is much smaller, being 65 for the Conservatives, 27 for the Liberals, and 45 for the Socialists (6, Table I, p. 201). These respondents were also urban but there were no controls on age, sex, or education (6, p. 200). 

The communist sample. The procedure utilized in sampling communists differed since these subjects were recruited directly through the party organization. The total information available is: “Contact was made with Party Branches through a member of the Communist Party who undertook to collect the questionnaire replies. He used two different branches, one primarily working-class, the other primarily middle-class. Relatively few refusals were encountered among those approached, in spite of a feeling that this type of work was ‘futile’” (6, p. 200). Fifty protocols were collected from middle-class communists and 96 from working-class communists (6, Table I, p. 201). 

The fascist sample. The only information available as to the recruitment of these subjects is the single sentence, “Only seven middle-class persons could be found who were followers of Mosley and may properly be called ‘fascists’” (6, p. 206). 

Comparisons of samples. In the earlier article dealing with the middle-class respondents Eysenck argued for what he termed analytic sampling, i.e., he was interested in comparing attitudes of members of the three major parties when other variables affecting (or possibly affecting) attitudes were held constant: age, sex, education, and place of residence (5, pp. 53-58). This is a legitimate approach and there is no quarrel with it. Since no significant differences were found to be related to the first three of these in the basic middle-class sample, controls on them were dropped for the working-class, communist, and fascist samples. Such a procedure is based upon an implicit assumption that if there are no relationships between certain variables in one sample, there will be none in a sample of quite a different nature. Such an assumption may be valid but it needs to be demonstrated because it has been shown that different relationships between attitudinal variables hold in middle- and working-class samples (4, pp. 171-172; 10, pp. 58-61). In other words, differences found by Eysenck between middle- and working-class adherents of the major parties might well be a result of uncontrolled factors and not simply a result of different class membership. 

These same criticisms might be expected to apply with even greater force to comparisons between members of major British political parties and those belonging to the communist or fascist parties. Almond’s sample of middle-class communist defectors indicates strongly that they were a deviant group aside from their political idiosyncrasies (2). There is another problem which may lead to bias in the comparisons between the communists and others. The communists were recruited through an organization which implies active political interest; adherents of the major political parties may or may not have been politically active since they classified themselves as to “. . . the group in which you would include yourself” when presented with a list of parties (5, Table I, Q. 47, p. 78). It is well known that people who belong to groups differ in many respects from those who do not (10, pp. 61-63). Now it may be argued that communists are by definition active group members and thus differ from the majority of the rest of the population. It would nevertheless be extremely important to know whether they are less different from those who are politically active in major parties than from those who merely list themselves in a particular party when asked to do so. In short, to what extent are differences in attitudes between communists and major party members traceable to ideology per se and to what extent to other factors relating to political activity? 

Eysenck’s failure fully to consider the implications of biases in sampling can best be illustrated by his discussion of the representativeness of his basic middle-class sample. He quite rightly notes that the initial results should not be generalized to rural or working-class populations. He subsequently admits that all British urban middle-class people did not have an equal probability of being drawn since his students were not sufficiently widely acquainted. He then says, “. . . it seems unlikely that this principle would affect very many middle-class people, or that it would be correlated in any systematic way with the type of attitude which is being studied. Careful scrutiny of the papers written by the students, and verbal questioning after discussion of sampling procedures, did not reveal any suggestion that our samples were seriously biased; while this conclusion cannot, of course, be accepted as definite proof, it is perhaps near enough the truth not to affect our conclusions in a very serious manner” (5, pp. 57-58). 

It is the present contention that very serious biases were present in Eysenck’s basic middle-class sample as a result of the sampling procedures. Comparisons of his samples with estimates of the parent middle-class population indicate this very clearly. 

First, consider the age distribution of Eysenck’s basic middle-class sample. He dichotomized sample members as older and younger, the cutting point being thirty years of age (5, p. 57). The actual range and distribution of ages of respondents is not given (although curiously enough the range of ages of the students collecting the data is given!) (5, p. 54). Presumably, the respondents must have been of or almost of voting age or the results would be almost meaningless. According to present calculations approximately 20 per cent of the adult British population in 1951 was in the 20-29 year age range with 80 per cent falling over thirty years of age (based on 14, Table 7, p. 8). Yet 64.9 per cent (487 out of 750) of Eysenck’s respondents were under 30 years of age (calculations based on 5, Table 3, p. 78). What has happened is quite clear. Eysenck’s students tended to choose as respondents friends and acquaintances who were near their own age level and the entire distribution was skewed toward the younger age groups. 
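
The size of this skew can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the figures quoted above; the variable names are illustrative and are not Eysenck's, and the comparison rests on the same presumption as the text, namely that the respondents under 30 were of or near voting age.

```python
# Figures quoted in the text; variable names are illustrative only.
respondents_under_30 = 487        # from (5, Table 3)
respondents_total = 750           # three party samples of 250 each
population_share_20_29 = 0.20     # approximate share of British adults, 1951

sample_share = respondents_under_30 / respondents_total
over_representation = sample_share / population_share_20_29

print(f"Share of sample under 30: {sample_share:.1%}")
print(f"Over-representation relative to population: {over_representation:.1f}x")
```

On these figures the younger group is represented in the sample at a little over three times its share of the adult population.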

In view of this, the fact that Eysenck found only a slight but not significant tendency for the younger members of his sample to be more radical is not at all puzzling. Eysenck says in reference to this point, “The failure of the old in our sample, to be more Conservative than the young, is perhaps in opposition to expectation . . .” (5, p. 68). It is an elementary statistical principle that a truncated distribution obscures relationships and this, it is suggested, is the most probable reason for the lack of differentiation between Eysenck’s so very young, “young” group and his not so old, “old” group. 
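
The statistical principle invoked here can be illustrated with a small simulation. What follows is a sketch under stated assumptions, not a reanalysis: the age range, the linear trend of 0.05 score points per year, and the noise level are all invented for illustration, and the helper `pearson_r` is written out by hand so that the block needs only the standard library.

```python
import random


def pearson_r(xs, ys):
    """Plain Pearson product-moment correlation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: ages 20 to 80, with a made-up linear age trend
# in an attitude score plus random noise. None of this is Eysenck's data.
ages = [rng.uniform(20, 80) for _ in range(20_000)]
scores = [0.05 * age + rng.gauss(0, 1) for age in ages]

# Truncated sample: keep only respondents under 40.
young = [(a, s) for a, s in zip(ages, scores) if a < 40]
young_ages = [a for a, _ in young]
young_scores = [s for _, s in young]

r_full = pearson_r(ages, scores)
r_truncated = pearson_r(young_ages, young_scores)

print(f"r over the full age range: {r_full:.2f}")
print(f"r in the truncated sample: {r_truncated:.2f}")
```

With the same underlying trend, the correlation computed within the restricted age band comes out well below the correlation over the full range, which is the sense in which truncation obscures a relationship.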

A similar criticism may be leveled against the educational bias in this sample. Those “who have had a university education” totaled 57.7 per cent (computed from 5, Table 3, p. 78); the precise definition of university education, whether merely attendance or graduation, is not specified. There are no figures available which give the number of university graduates or of those with some university attendance in Great Britain. 

It is possible, however, to piece together bits of information which indicate that Eysenck’s sample is extremely highly educated as contrasted to the British middle-class. British census data give a detailed breakdown of university attendance. In 1950-51 there were 102,012 full and part-time university students in Great Britain who were taking courses (14, Table 108, p. 90) and 17,337 first degrees were given (computed from 14, Table 111, p. 92). Comparable figures for the United States in 1950 are 2,659,021 students attending colleges and universities (16, Table 140, p. 125) and 432,058 first degrees (16, Table 139, p. 124). For every British university student there were 26.07 American students; for every British first degree there were 24.92 American first degrees. When corrections for the total populations of the two countries are made it is apparent that the ratio of university students in England and the United States is approximately one to eight or nine. It is impossible to determine the effects of foreign students upon these comparisons although they are believed not to affect the preceding comparisons markedly; less than 10 per cent of British full-time students in 1950-51 were from outside the United Kingdom (computed from 14, Table 108, p. 90). 
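
The two ratios above follow directly from the enrollment and degree counts, and the population correction can be sketched in the same way. In the Python below the counts are those quoted in the text; the population ratio of 3.0 is an assumption introduced here for illustration (the text does not state the population figures it used), chosen as a round approximation to the relative sizes of the two countries around 1950.

```python
# Counts quoted in the text.
uk_students = 102_012
uk_first_degrees = 17_337
us_students = 2_659_021
us_first_degrees = 432_058

student_ratio = us_students / uk_students
degree_ratio = us_first_degrees / uk_first_degrees

# ASSUMPTION (not from the source): the United States population is taken
# as roughly three times that of Great Britain.
assumed_population_ratio = 3.0

per_capita_student_ratio = student_ratio / assumed_population_ratio
per_capita_degree_ratio = degree_ratio / assumed_population_ratio

print(f"American students per British student: {student_ratio:.2f}")
print(f"American first degrees per British first degree: {degree_ratio:.2f}")
print(f"Per capita student ratio: {per_capita_student_ratio:.1f}")
print(f"Per capita degree ratio: {per_capita_degree_ratio:.1f}")
```

Under that assumed population ratio both per capita figures fall between eight and nine, in line with the "one to eight or nine" stated above.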
Fortunately, for present purposes, recent census data in the United States (since 1940) contain estimates as to the amount of education received. Thus in 1950, 5,784,570 Americans claimed four or more years of college (based upon a 3⅓ per cent sample) (15, Table A, p. SB-12). Thus 5.9 per cent of the 97,403,307 Americans over 21 years of age in 1950 claimed college graduation. If roughly half the American adults were to be considered middle-class the proportion of college graduates would be around 12 per cent among them (since education is a measure of class). A recent estimate by the Census Department indicates that 15.4 per cent of the adult American population has had some college education (or roughly 30 per cent of the middle-class) (cited in 13, p. 238). 
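
The steps from the census counts to the middle-class proportions can be written out explicitly. This is a minimal sketch in Python; it adopts the text's own simplifications, namely that half of the adult population is middle-class and that essentially all of those with college education fall within that half, and the variable names are illustrative.

```python
# Counts and percentages quoted in the text.
college_graduates = 5_784_570
adults_over_21 = 97_403_307
some_college_share = 0.154          # Census estimate cited in (13, p. 238)

# Simplifying assumption taken from the text: half of adults are middle-class,
# and the college-educated are counted as falling within that half.
middle_class_fraction = 0.5

graduate_share = college_graduates / adults_over_21
graduate_share_middle_class = graduate_share / middle_class_fraction
some_college_share_middle_class = some_college_share / middle_class_fraction

print(f"Graduates among all adults: {graduate_share:.1%}")
print(f"Graduates among the middle-class: {graduate_share_middle_class:.1%}")
print(f"Some college among the middle-class: {some_college_share_middle_class:.1%}")
```

The results agree with the rounded figures in the text: about 5.9 per cent of adults, about 12 per cent of the middle-class with degrees, and roughly 30 per cent of the middle-class with some college education.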

Since age distributions and the relative rates of growth of institutions of higher learning differ slightly in the United States and Great Britain present estimates are rough. Application of the ratios previously determined to the preceding figures would suggest that the proportion of adults in the British middle-class (similarly assuming roughly half the population as middle-class) having a university degree would not be much above 2 per cent and that of those having some university education not much above 5 per cent. These are rough estimates but it is believed that they are not grossly in error. They are so far below the 57.7 per cent of Eysenck’s sample that it is clear that his sample was completely unrepresentative of the British middle-classes. 

It would be tedious to continue demonstration of other aspects of the nonrepresentative nature of Eysenck’s middle-class sample. Such biases might well be expected to have a major effect on the attitudes elicited from subjects. Eysenck notes that the correlation between social class and political attitudes in Great Britain is .67 (8, p. 19), “. . . that social class estimates are determined almost completely by social status” (8, p. 20) and that, “. . . education . . . is of course so closely related to status that the results are almost a foregone conclusion” (8, p. 20). 

In view of the fact that Eysenck’s basic middle-class sample is markedly unrepresentative of the British middle-classes, it would be highly dangerous to project their attitudes to obtain an estimate of the parent populations. Yet Eysenck suggests this possibility both in an earlier article (5, p. 57) and in The Psychology of Politics (8, p. 127). 

Of crucial importance for the present discussion, however, is the fact that the comparisons of scale scores among groups belonging to different social classes and political parties embody not only these differences but many other uncontrolled biases as well. No confidence can be placed in the generality of the differences in scores found among the groups studied with the exception of comparisons within the middle-class where age, sex, and education were roughly controlled. 

⁴ This is a crude estimate. Centers (3, Table 8, p. 57) broke down a 1945 Gallup cross-sectional sample of white males of 21 years of age or over in the United States into the following groupings for urban residents: all business, professional, and white collar (N = 430); all urban manual (N = 414). The rural categorization was: farm owners and managers (N = 153); farm tenants and laborers (N = 69). If the initial categories are considered middle-class, slightly over half the sample would be so classified. If nonwhites had been included the proportion of middle-class would presumably decrease slightly. In view of sampling errors (3, Table 1, p. 38) an estimate of 50 per cent seems a reasonable approximation. 

⁵ The question of the comparability of the proportions of the population in similar classes in Great Britain and the United States is a puzzling one since different criteria are apparently used by the Gallup organizations in the two countries. Centers found 43 per cent of his sample classified themselves as middle-class (3, Table 18, p. 77). Eysenck presents data on British class identification (8, Table III, p. 18). Present calculations indicate that 41.7 per cent identified themselves as either middle or lower-middle class (in obtaining this figure the computed total of 8,890 was used as a denominator since Eysenck’s addition is erroneous). Since subjective class identification is substantially correlated with external ratings of class membership, these figures suggest that the proportion of middle-class individuals in the two countries is not too dissimilar. However, the wording of the alternatives in the two surveys differed and differences in the meaning of class labels in the countries is an unknown factor. 


The Measurement of “Tough-Mindedness” 


Rokeach and Hanley (12) have discussed Eysenck’s T (“tough-mindedness”) factor.⁶ A re-examination of the portions of Eysenck’s work to which they refer clearly indicates that the mean scores reported by Eysenck are in disagreement with the data which he reported. Aside from such computational errors, there are other aspects of the T scale which are relevant in any attempt to uncover the significance of scores on the T scale made by samples of different political affiliation. 

Three biasing factors which are empirically related to the scores made on the T scale among the samples examined by Eysenck have been uncovered. These are: (a) the treatment of the “no-answer” category, (b) the asymmetric nature of the scale, and (c) the different interpretation of the items among various samples. Each of these biasing effects shall be considered separately. 

Treatment of the “no-answer” category. In most attitude scales the treatment of neutral categories is explicitly or implicitly based upon the assumption that a respondent who does not have an attitude, can not make up his mind as to the answer to the specific question asked, or otherwise does not agree or disagree, is not an extremist in terms of whatever the scale presumably measures. Likert-type scales are constructed so that such a reply (or lack of one) to an item is scored intermediately between acceptance and rejection. 

The scoring system employed by Eysenck is based upon other (unspecified) assumptions. Nine of the fourteen items (see Fig. 1) are “tough-minded” and five are “tender-minded.” Respondents are allowed to choose among the following alternatives: “strongly approve” (++), “approve on the whole” (+), “can’t decide for or against, or if you think that the question is inadequately worded” (zero), “disapprove on the whole” (-), “strongly disapprove” (--) (8, p. 122). If the particular item being scored is one of the five tender-minded ones, agreement, whether “strong” or “on the whole,” is given one point. All forms of neutrality and disagreement are arbitrarily scored as zero on the T scale. If the item in question happens to be one of the nine tough-minded ones, disagreement of any variety is given a weight of one. Any form of agreement and “zero” responses are given no weight. 

⁶ The present critique has been modified as a result of Rokeach and Hanley’s analysis to minimize duplication. 

Such a scoring system has interesting inherent properties. A respondent who disagreed with everything, whatever the content, would automatically have a score of nine. Disgruntlement, in this case, leads to a score which is more tender-minded (by virtue of the scoring system) than would be true in the case of one who accepted all items. Acceptance, however strong, of all items would result in a maximum total score of five. If, however, a respondent, for reasons noted or others, could not agree or disagree with any item his final score would be exactly zero. 

The logic of a scale which classifies a person who disagrees with every item of a battery of items as high in tender-mindedness and one who “can’t decide for or against, etc.” as being at the extreme pole of tough-mindedness is difficult to understand. 
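
The scoring rule as described can be stated compactly in code, which makes the three limiting cases easy to verify. The following Python is a sketch of the rule as it is summarized in this critique, not a reproduction of Eysenck's own scoring key; the function and constant names are hypothetical, and the order of items within the list is arbitrary since only the counts of nine tough-minded and five tender-minded items matter here.

```python
AGREE = {"++", "+"}
DISAGREE = {"-", "--"}

# Nine tough-minded and five tender-minded items, as described in the text.
ITEM_KINDS = ["tough"] * 9 + ["tender"] * 5


def item_points(kind, response):
    """Return the T-scale point (0 or 1) for one response to one item."""
    if kind == "tender":
        # Only agreement earns a point; neutrality and disagreement score zero.
        return 1 if response in AGREE else 0
    if kind == "tough":
        # Only disagreement earns a point; neutrality and agreement score zero.
        return 1 if response in DISAGREE else 0
    raise ValueError(f"unknown item kind: {kind}")


def t_score(responses):
    """Total T score (higher = more tender-minded) for fourteen responses."""
    if len(responses) != len(ITEM_KINDS):
        raise ValueError("expected one response per item")
    return sum(item_points(k, r) for k, r in zip(ITEM_KINDS, responses))


all_disagree = t_score(["--"] * 14)   # rejects every item
all_agree = t_score(["++"] * 14)      # accepts every item
all_undecided = t_score(["0"] * 14)   # cannot decide on any item

print(all_disagree, all_agree, all_undecided)  # prints "9 5 0"
```

The three totals reproduce the cases discussed above: wholesale rejection yields nine, wholesale acceptance yields five, and wholesale indecision yields zero, the most tough-minded score the scale allows.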

Of more direct interest is the extent to which Eysenck’s use of the no response category affects the T-scale comparisons of his samples. If there were no systematic differences in response sets among the members of the various samples the possibility of bias would be largely vitiated. Since, however, there are more tough- than tender-minded items the scoring of the zero category operates to make samples characterized by a high proportion of indeterminant answers more tough-minded. Unfortunately, Eysenck does not report the frequency of these responses among members of the various samples. He does, however, roughly report the frequency of extreme responses (++ and --): “. . . we find that only 35 per cent of the socialist, liberal and conservative responses have been marked in this fashion, but 54 per cent and 51 per cent respectively of the middle-class and working-class communist responses” (6, p. 206). Among the seven fascists, “It is of interest to note that these subjects were the most emphatic of all, their proportion of ++ and -- scores being 67 per cent” (6, p. 207). 

The above figures indicate that since the communist and fascist samples checked more extreme responses than did members of the three major parties, they also had a lower frequency of zero responses. Such an inference is in agreement with the known relationship between extremity and intensity of attitude. As Eysenck notes in a discussion of previous research, “. . . when the different groups who had taken part in the study were compared there was a marked tendency for the more extreme groups to be more certain of their opinions. This characteristic we shall find again in our discussion of Communist and Fascist ideologies” (8, p. 120). 

Upon the basis of available data it can be inferred that the conservatives, liberals, and socialists sampled had a higher frequency of zero responses than did the samples of communists and fascists. The arbitrary system of scoring which treated zero responses as tough-minded thus introduced a bias of unknown extent in the direction of making the members of the three major parties more tough-minded, relatively speaking, than those of the two deviant parties. 

The asymmetric nature of the scale. In the present analysis of the T scale considerable importance is attached to the fact that the items also measure radicalism and conservatism. Of the fourteen items, nine have a higher saturation on R (radicalism and conservatism) than on T. No item is clearly an independent measure of T. The original T scale was based upon a factor analysis of 40 items responded to by the basic middle-class sample previously discussed. A total of 23 items had saturations of +.20 or greater upon the T factor. It is impossible upon the basis of an inspection of the saturations to determine why some items were included in the T scale when other items with higher saturations were not (8, Table XX, p. 129). This is in direct conflict with what Eysenck says in The Psychology of Politics: “. . . we must obviously construct measuring instruments for R and T respectively. Two scales were accordingly constructed by combining the items most highly correlated with the two factors respectively, each scale consisting of 14 items” (8, p. 133, italics mine). 

The reliability was .64 for the T scale (.80 when corrected by the Spearman-Brown formula) and .81 for the R scale (.90 corrected) (5, p. 65). The lower reliability of the T scale is not surprising since the variance accounted for by T and R is 8 and 18 per cent respectively (5, p. 59), some of the items in the R scale have practically no saturation on T, and the lowest saturation of any R-scale item on R is .45 in contrast to the low of .20 of T-scale items on T (8, Table XX, p. 129). 

The crucial point in an interpretation of Eysenck’s results is that the T scale is a somewhat better measure of R than T. The mean loading of T-scale items on T is .38, on R, .48 (calculated from data in 8, Table XX, p. 129). 

If we consider, as Eysenck does, communists to be both tough-minded and radical, fascists to be tough-minded and conservative, conservatives to be tender-minded and conservative, and socialists to be tender-minded and radical, certain consequences follow in the determination of the scores of members of these parties. An examination of Fig. 1 indicates that there are different numbers of items in the four quadrants. When this fact is combined with Eysenck’s scoring system, strange results may be expected. Assume that a hypothetical consistent communist and his fascist counterpart were answering the items in the T scale, hypothetical perfection being defined as being well indoctrinated in their respective ideologies (radical and conservative), and that both were equally tough-minded. Both would receive a total of exactly no points for rejecting the five tender-minded items in the T scale. The hypothetically perfect communist would receive five points for rejecting the five items in the conservative tough-minded quadrant and no points for accepting (however strongly) the four items in the radical tough-minded quadrant. The consistent fascist, on the other hand, should reject the four items in the radical tough-minded quadrant (four points) but would receive no points for ac-








EVSENCK'S TREATMENT OF THE PERSONALITY OF COMMUNISTS 419 


“Tough - minded” 





Radical Conservative 


70 60 , 4 2 ‘ 30 40 50 60 ,70 


"Tender- minded” 


Fic. 1. Pror or tHe 14 T-Scace Items ON THE T AND R Axts (TAKEN FROM S, Pp. 81). THERE 
Is a CORRELATION OF —.12 BETWEEN R aNnp T (5, P. 66). 

The actual items are (5, Table 1, pp. 76-77 
Radical ** Tough-minded" Quadrant 

29. Men and women have the right to find out whether thev are sexually suited before 

marriage (e.g., by companionate marriage 

9. Sundav-observance is old-fashioned, and should cease to govern our behavior. 
23. Divorce laws should be altered to make divorce easier 
5. The laws against abortion should be abolished 
Conservative *‘ Tough-minded"’ Quadrant 

13. Conscientious objectors are traitors to their country, and should be treated accordingly 
1. Colored people are innately inferior to white people 
3. War is inherent in human nature 
39. The Japanese are by nature a cruel people 


5. Persons with serious hereditary defects and diseases should be compulsorily sterilized 
‘onservative ‘ Tender-minded" Quadrant 


16. Only by going back to religion can civilization hope to survive 
28. It is right and proper that religious education in schools should be compulsory. 
Radical ‘‘Tender-minded" Quadrant 
10. It is wrong that men should be permitted greater sexual freedon than women by society. 
8. In the interests of peace, we must give up part of our national sovereignty. 
36. The death penalty is barbaric, and should be abolished 








$20 RICHARD CHRISTIE 


cepting the five conservative tough- 
minded items. 

By virtue of an asymmetric distribution
of items combined with
Eysenck's singular scoring system,
a hypothetically consistent fascist
is automatically made more tough-minded
by one point than a hypothetically
consistent communist. The
confusion inherent in such a scoring
system becomes the more puzzling
since Eysenck persists in lumping
fascists and communists together as
being tough-minded.
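
The one-point asymmetry can be checked with a short sketch. It assumes a simplified reading of the scoring described above: one point for rejecting a tough-minded item, one point for accepting a tender-minded item, and nothing otherwise. The point for accepting a tender-minded item is an assumption made by symmetry, and the function and variable names are hypothetical. A "consistent" respondent is taken to reject every item at the opposite pole and, within his own pole, to accept only the items matching his ideology.

```python
# Number of T-scale items per quadrant, as listed under Fig. 1.
QUADRANTS = {
    ("radical", "tough"): 4,
    ("conservative", "tough"): 5,
    ("conservative", "tender"): 2,
    ("radical", "tender"): 3,
}


def consistent_score(ideology, pole):
    """T-scale score of a hypothetically consistent respondent.

    Assumed scoring: +1 for each rejected tough-minded item and +1 for
    each accepted tender-minded item, so low scores are tough-minded.
    """
    score = 0
    for (item_ideology, item_pole), n_items in QUADRANTS.items():
        accepts = item_pole == pole and item_ideology == ideology
        if item_pole == "tough" and not accepts:
            score += n_items
        elif item_pole == "tender" and accepts:
            score += n_items
    return score


communist = consistent_score("radical", "tough")
fascist = consistent_score("conservative", "tough")
socialist = consistent_score("radical", "tender")
conservative = consistent_score("conservative", "tender")
print(communist, fascist, socialist, conservative)
```

Under these assumptions the fascist scores one point lower (more tough-minded) than the communist, purely because the two tough-minded quadrants hold five and four items.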

A similar analysis indicates that
a hypothetically consistent socialist
should be more tender-minded by
one point than a hypothetically consistent
conservative. Since many of
the differences in mean scores on the
T scale are less than a point apart, the
preceding indication of the importance
of differential weighting arising
from the scoring system takes
on crucial importance. The field of
attitude measurement is bedeviled
with enough problems without including
scales with built-in biases
based upon unspecified assumptions.


Interpretation of the items. In the 
preceding section we have dealt with
the responses which a hypothetical
communist might make to the T 
scale. The underlying assumption 
was that he should reject all items 
except those which were radical 
and tough-minded since these are the 
characteristics which Eysenck at- 
tributes to members of the Commu- 
nist Party. An examination of Ey- 
senck’s data indicates that the com- 
munists sampled responded in quite 
a different fashion. 

Table 1 shows the mean percentage
acceptance of items falling in
these quadrants by communists and
members of other parties (with the
exception of the small fascist sample
for whom data are not given). Both
middle- and working-class communists
show markedly greater acceptance
of radical tough-minded items
and greater rejection of conservative
tender-minded items than subjects
affiliated with other parties. These
results are completely in line with
Eysenck's analysis and the expectations
of anyone familiar with political
attitudes.

However, the theoretically crucial
responses are communist responses
to T-scale items which fall into the
other two quadrants. Both middle-
and working-class communists are
markedly less receptive to the items
falling in the conservative tough-minded
quadrant than other sample
members and more receptive to those
items falling in the tender-minded
radical quadrant. Examination of
these figures suggests that the communists
sampled are not responding
to the tough- or tender-mindedness
of T-scale items but rather to the
radical or conservative content. Our
calculations indicate that the middle-class
communist sample had a mean
acceptance of 91 per cent of the seven
T-scale items (four tough- and three
tender-minded) with a radical saturation
and only 7 per cent of the seven
items (five tough- and two tender-minded)
with a conservative loading.
Comparable figures for working-class
communists are 81 and 14 per
cent. On the other hand, if the responses
to the nine tough-minded
items are examined, the mean acceptance
is 47 per cent by both middle-
and working-class communists and
mean acceptance of the five tender-minded
items is 47 and 50 per cent
respectively.

TABLE 1
Mean Percentage Acceptance of T-Scale Items by Quadrant, Social Class,
and Political Affiliation*

Rows (Item Nos.):
Tough-minded, Radical: 29, 9, 23, 15
Tough-minded, Conservative: 13, 1, 3, 39, 5
Tender-minded, Conservative: 16, 28
Tender-minded, Radical: 10, 8, 36
Columns: Middle-Class and Working-Class samples, each by party (Comm., Soc., Lib., Cons.)
[Cell values are not legible in this copy.]

* Taken from 6, Table III, p. 203.

The present interpretation of these
figures is simple. The communists
sampled by Eysenck responded to the
T scale not upon the basis of their
loadings on tough- and tender-mindedness
but responded directly in terms
of their radical-conservative loading.
The T scale simply does not apply to
communists (or at least to those
sampled). Comparisons of scores
made by communists on a scale on
which they did not respond along the
continuum measured with scores
made by other samples are meaningless.


Analysis. An analysis of Eysenck's
treatment of results also leads to
questions of interpretation. Let it be
assumed (as it assuredly is not) that
there are no problems in sampling
or measurement in the data which
Eysenck reports. Let it further be
assumed (despite our agreement with
Rokeach and Hanley's recomputations)
that Eysenck's addition is correct.
It is still possible to raise questions
about the manner in which the
data are treated.

It has been previously noted that
the comparisons of various "groups"
involved an "average of an average
score." The reported score of liberals
on the T scale thus was based upon
the singular procedure of adding the
mean (7.9) of 250 middle-class liberals
(as sampled) to that (7.4) of 27
working-class liberals (as sampled)
and dividing by two and then rounding
up the "average of an average"
of 7.65 to 7.7 in deriving the tough-mindedness
score of liberals.
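
The difference between the "average of an average" and a mean that respects sample size can be shown with the liberal figures just quoted. This is a minimal sketch; the function names are illustrative and the pooled mean is computed here, not reported in the source.

```python
def unweighted_mean_of_means(means):
    """The "average of an average": every subsample counts equally."""
    return sum(means) / len(means)


def pooled_mean(means, sizes):
    """Mean over all respondents: each subsample weighted by its size."""
    return sum(m * n for m, n in zip(means, sizes)) / sum(sizes)


# Middle-class and working-class liberal T-scale means and sample sizes.
liberal_means = [7.9, 7.4]
liberal_sizes = [250, 27]

avg_of_avg = unweighted_mean_of_means(liberal_means)
pooled = pooled_mean(liberal_means, liberal_sizes)
print(round(avg_of_avg, 2), round(pooled, 2))
```

Because 250 of the 277 liberals are middle-class, the pooled mean stays close to 7.9, whereas the unweighted figure gives the 27 working-class respondents half of the total weight.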

The "average of an average" treatment
leads to even more remarkable results when
applied to the combined communist-fascist
samples. The only way in
which the writer is able to arrive at
the "average of an average" score reported
by Eysenck is as follows: add
the mean T-scale score of the 50 middle-class
communists to that of the 96
working-class communists and divide
by two, which gives a score of 6.4;
take the mean score of seven fascists
and add it to the previous figure, divide
by two, and round down the
"average of an average" of 5.55 to
5.5 to obtain the figure given by
Eysenck.⁷
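
The effect of this nested averaging on individual respondents can be made explicit. The sketch below works out the weight each person carries in the final figure, together with the fascist mean implied by the two reported numbers (6.4 and 5.55). The variable names are hypothetical, and the implied fascist mean is a back-calculation, not a value given in the source.

```python
# Sample sizes quoted in the text.
n_middle_comm, n_working_comm, n_fascist = 50, 96, 7

# Step 1 averages the two communist means (1/2 each); step 2 averages
# that result with the fascist mean (1/2 each). Each communist class
# therefore ends up with 1/4 of the total weight, the fascists with 1/2.
w_middle_comm = 0.25 / n_middle_comm    # weight per middle-class communist
w_working_comm = 0.25 / n_working_comm  # weight per working-class communist
w_fascist = 0.5 / n_fascist             # weight per fascist

# How many working-class communists one fascist outweighs.
ratio = w_fascist / w_working_comm

# Fascist mean implied by (6.4 + x) / 2 = 5.55.
implied_fascist_mean = 2 * 5.55 - 6.4

print(round(ratio, 2), round(implied_fascist_mean, 2))
```

On this reading each of the seven fascists carries roughly 27 times the weight of a working-class communist in the combined score.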

Various possible comparisons of
the scores of various samples (taking
middle- and working-class together)
are given in Table 2. When the mean
score is used instead of the "average
of an average" on Eysenck's reported
sample means, the communists are
less deviant from the other groups.
If a mean is taken on the T scores
as recomputed by Rokeach and Hanley
they become even less deviant. If
one wished to weight the middle- and
working-class means by their relative
proportion in various political
parties still different figures would be
found.

⁷ An examination of the rounding practice
followed by Eysenck indicates a systematic
procedure. Contrary to more customary procedures
of rounding in a consistent direction,
to odd or even numbers, etc., Eysenck's
roundings in these comparisons are such as
to maximize the discrepancy between communist-fascist
and other political groupings.


TABLE 2
Comparison of Tough-Mindedness Scores of Members of Various
Samples by Party

Rows (Party): Liberal, Conservative, Socialist, Communist, Fascist
Columns: only the heading "Mean of Rokeach and Hanley's Means" is legible
[Cell values are not legible in this copy.]
Source notes: (8, pp. 137-138); 8, Table XXIII, p. 138; 12, Table 2, p. 171.


It can only be concluded that 
Eysenck presented his data in such a
way as to maximize the differences 
between communists and fascists on 
the one hand and other political par- 
ties on the other. The differences in 
mean T-scale scores of various sam- 
ples are less than the errors that 
might be reasonably expected to oc- 
cur from sampling biases and the 
peculiarities of the scoring system. 
It is impossible to place any reliance 
in the T-scale differences among vari- 
ous samples even if Eysenck’s unu- 
sual arithmetic practices are replaced 
by more conventional techniques. 


THE COULTER STUDY 


Eysenck believes that Coulter's
study represents confirmation of his
own findings. He states, "These results
(Coulter's) bear out in every
detail the results of the previous
study (Eysenck's), and we may accordingly
conclude that our main
hypothesis is strongly supported"
(8, p. 142). This assertion is believed
to be unjustified upon the basis of
available data.⁸


Sampling 

Coulter gave a battery of tests to 
three samples. All were composed 
of British working-class males (8, p. 
142). One was a "neutral" sample
of either 86 (8, pp. 142, 202) or 83
soldiers (8, p. 152). The criteria for
selection are not specified by Eysenck
although he states that they ". . . constituted
a fairly random sample of
the British working-class males" (8,
p. 142). No information is given as 
to whether these soldiers were volun- 
teers or conscripts. Since military 
samples underrepresent older age 
groups and those older men in the 
Army tend to be “Old Army Men” 
who are certainly not typical of the 
working-class population, it is most 
unlikely that such a group would even 
roughly approximate a random sam- 
ple of the working-class. 

Coulter's communist and fascist
samples were each composed of 43
working-class males. As far as is
known, no reliable estimates as to
the characteristics of the parent populations
being sampled exist. It is
therefore impossible to determine the
representativeness of these samples.

⁸ Discussion of these data is restricted to
what Eysenck reports concerning them in The
Psychology of Politics. Neither Coulter's nor
Melvin's theses which are germane to the
topic have been published. A copy of Coulter's
thesis was examined after completion of
this critique. It has not been necessary to
modify present criticisms. Eysenck refers (8,
p. 276) to ". . . Melvin (1954). . . ." The
only Melvin listed in the bibliography (8, p.
301), is, "Melvin, D. An experimental and
statistical study of two primary social attitudes.
Ph.D. Thesis, Univ. London Lib.,
1953." According to a letter dated Feb. 14,
1955, from the University of London Library,
no such thesis had been filed and no information
was available concerning it.


Measurement 


Coulter used a revised set of R and 
T scales devised by Melvin. The 
latter started with a pool of 60 items 
and factor analyzed the question- 
naires of 650 respondents of unspeci- 
fied origin (8, p. 132). Twenty of the 
forty items used by Eysenck were in- 
cluded—based upon a comparison of 
(6, pp. 208-209) and (8, pp. 277-279). 
Of these eleven were used as measures 
of both R and T by Eysenck (Items 
1, 3, 8, 9, 15, 16, 23, 28, 29, 36, and 
39), three as measures of R only 
(Items 12, 26, and 27), three as meas- 
ures of T only (Items 5, 10, and 13), 
and three were not included in the 
original R and T scales although they 
were in Eysenck’s pool of items 
(Items 6, 18, and 35). 


Melvin added another 40 items,
although Eysenck says there were
38 (8, p. 132). An inspection of these
indicates that they are fairly similar 
to the ones originally used by Ey- 
senck. In the new R and T scales, the 
R scale was expanded to 16 items and 
the T scale underwent drastic revi- 
sion and was expanded to 32 items. 

Of the eleven items measuring both 
R and T in Eysenck’s scaling system 
only two were used in the same fash-
ion by Melvin (Items 29 and 36).
One was used as a measure of T alone 
by Melvin (Item 23). The other 
eight did not emerge in Melvin's 
scales. The three original measures 
of T alone are not included in Mel- 
vin's T scale. Two of the original 
measures of R alone used by Eysenck 
perform the same function in Mel-
vin's revision (Items 12 and 27). The
other (Item 26) is used by Melvin to 
measure both R and T. 

The scoring system used by Melvin 
is identical to that used in Eysenck's
original T scale. Twenty of the 32
items are in the tough-minded direc- 
tion, twelve in the tender-minded 
direction (8, pp. 276-279). It is im- 
possible to determine from the ma- 
terial Eysenck presents whether the 
asymmetry which served as a source 
of bias in his scale is also present in 
Melvin’s and this question cannot 
be answered due to the unavailability 
of the latter's thesis. It is clear, how- 
ever, that the criticism made previ- 
ously of the bias resulting from dif- 
ferential group response sets favoring 
utilization of the no response or zero 
category applies to Melvin's revision.

Eysenck notes that the split-half 
reliabilities of Melvin's revision of 
the R, T, and E (emphasis) scales
lie between .85 and .95 in ". . . a
relatively unselected group" (8, p.
277). It is Eysenck's contention that
Melvin's research ". . . showed that
our original results could be reproduced
with an entirely different set
of items" (8, p. 132). Data are not
available to evaluate the accuracy of 
this statement but the point is not 
germane to the present argument. If 
the scales are measuring the same 
dimension there is no reason to be- 
lieve that the uncritical application 
of the T scale to communists would 
not be subject to the same bias as 
that demonstrated in Eysenck’s work. 
If, on the other hand, they are meas- 
uring something different we are left 
in an even more puzzling situation as 
to what comparative scores on the 
tests mean. 

Analysis. The means of the "neutral,"
communist, and fascist groups
on the T scale are not given by
Eysenck. However, the distribution
of scores of the latter two groups is
given and a point is presented which
represents the mean score of the
"neutral" group (8, Fig. 26, p. 141).
Our calculations indicate that the
means for the various groups are as
follows: "neutral," 14.2 (interpolated
approximation); communists, 11.05;
fascists, 7.85. The striking point in
this ordering is that the communists
fall almost exactly midway between
the "neutral" and fascist samples, being
3.15 units from the former and
3.2 units from the latter. Standard
deviations computed from the distributions
indicate that the scores of
the fascists and communists differ
significantly (CR of 4.18).

What this difference means is, of 
course, completely puzzling since 
there is no reason to suppose that the 
same vitiating circumstances which 
made the earlier comparisons mean- 
ingless do not apply with equal 
cogency. We see no greater reason 
on the basis of the data for lumping
communists and fascists as different
from a "neutral" group than for
differentiating fascists from "neutrals"
and communists.

There is one further matter which
Eysenck does not touch upon but
which makes the comparison of
Coulter's samples somewhat questionable.
It was noted that a sample
of soldiers was not very apt to be
representative of working-class males.
Examination of the mean R-scale
score of this group raises the possibility
that this is a most unusual
group of soldiers.

Melvin's R scale is eight-sevenths
the length of Eysenck's. If we assume
that acceptance of the items
tends to be about the same on the
two scales we can compare various
groups within the two studies. The
plausibility of such an assumption is
indicated by the fact that Eysenck's
working-class communists had a mean
score of 10.7 on R (8, Table XXIII,
p. 138). Multiplying this figure by
eight-sevenths we find a projected
mean of 12.23 as the hypothetical
value of working-class communists on
Melvin's revision. The actual mean
computed is 12.90 on Coulter's sample
(based on 8, Fig. 26, p. 141).
When we apply the same correction
to the SD of Eysenck's sample and
our own calculations of the SD of
Coulter's data, a test of significance
indicates that Eysenck's and Coulter's
communist samples do not differ
significantly in radicalism. This
conclusion rests, of course, upon two
assumptions: that the units of measurement
in Eysenck's and Melvin's
scales are similar and that the two
samples are comparable.
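
The eight-sevenths projection is simple enough to sketch directly. The snippet assumes, as the text does, that item acceptance is about the same on the two scales, so that a mean scales with the number of items (14 in Eysenck's R scale, 16 in Melvin's). The function name is hypothetical.

```python
EYSENCK_R_ITEMS = 14
MELVIN_R_ITEMS = 16
SCALE_RATIO = MELVIN_R_ITEMS / EYSENCK_R_ITEMS  # eight-sevenths


def project_to_melvin(eysenck_mean):
    """Project an Eysenck R-scale mean onto Melvin's longer R scale.

    Assumes equal per-item acceptance on both scales.
    """
    return eysenck_mean * SCALE_RATIO


projected = project_to_melvin(10.7)  # Eysenck's working-class communists
observed = 12.90                     # Coulter's communists, from Fig. 26
gap = observed - projected
print(round(projected, 2), round(gap, 2))
```

The projected and observed means differ by about two-thirds of a scale point, which the text reports as not significant.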

A similar projection cannot be 
made for the fascists since Eysenck’s 
lone seven were middle-class and 
Coulter's were working-class. What 
is of pertinence is the projection of 
scores of working-class members of 
the major parties. Using the same 
procedure as with the communists 
we find that the extension of Ey- 
senck’s R-scale means (8, Table 
XXIII, p. 138) yields projections as 
follows for Melvin’s scale: conserva- 
tives, 3.2: liberals, 4.2; and socialists, 
7.3. If we weight these groups by 
their representation in the British 
working-class population (as given 
in 7, p. 57) we find a projected mean 
of 5.8 on the R scale. Yet Coulter's 
sample made an R-scale score of 
10.8 (interpolated from 9, Fig. 26, 
p. 141). 

Why a group of soldiers in the 
British Army should be so strikingly 
more radical than would be expected 
on the basis of Eysenck's own findings
is extremely puzzling. It certainly 
does not argue for the generality of 
any conclusions based upon a com- 
parison of scores made by other 
groups with their own. The lack of 
internal consistency found in the 
analysis of data reported by Eysenck 
clearly indicates flaws in method- 
ology. Some of these we can pinpoint 
with reasonable accuracy; others are 
not so easily traceable since the es- 
sential data are not given. 








ARE COMMUNISTS AND FASCISTS
SIMILAR IN BEING "AUTHORITARIAN"?

Among the scales given to Coulter's
samples was the California F scale
(the particular form used is not specified).
In discussing this instrument
Eysenck says, "It was entitled the F
scale because Adorno et al. considered
it to be a measure of Fascist potential.
This interpretation, however,
as we shall very soon see, is in part
at least erroneous as we have found
Communists to make almost as high
scores on this scale as Fascists, and
consequently we shall in this book
refer to the F-scale rather as the
authoritarianism scale" (8, p. 149).

A few pages later he reports that
the F-scale scores were: "neutral"
group, 75; communists, 94; and
fascists, 159 (8, pp. 152-153). The
obvious fact here is that the communists
do not ". . . make almost as
high scores on this scale as Fascists . . ."
since the difference between
communist and fascist scores
is an extremely large 65 points whereas
the communists differ from the
"neutral" group by only 19 points.
Once again, Eysenck arbitrarily
lumps communists and fascists together
in an attempt to indicate their
similarity.
There are some singularly curious
things about the F-scale scores which
Eysenck does not dwell upon. The
item means for the three samples
are: "neutral" group, 2.5; communists,
3.13, and fascists, 5.30 (our calculations).
The range of possible
scores is from 1.0 to 7.0 with 4.0
representing the theoretical neutral
point. The communists' mean is
below this point indicating a general
tendency to reject the items. The
fascists have a high acceptance score
which represents a striking confirmation
of the validity of the F scale as a
measure of fascistic attitudes and
bluntly refutes Eysenck's contention
that the F scale does not measure
fascist potential.
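
The step from total F-scale scores to item means can be reconstructed as follows. The sketch assumes a 30-item form of the F scale; the source says the form is not specified, so the item count is an inference from the fact that the reported totals and item means agree under it. The "neutral" total is taken as 75, that is, 19 points below the communists' 94. All names are illustrative.

```python
N_ITEMS = 30          # assumption: not stated in the source
NEUTRAL_POINT = 4.0   # midpoint of the 1.0 to 7.0 item range

total_scores = {"neutral": 75, "communist": 94, "fascist": 159}

# Item mean = total score divided by the number of items.
item_means = {group: total / N_ITEMS for group, total in total_scores.items()}

# Distances between groups in total-score points.
comm_vs_fascist = total_scores["fascist"] - total_scores["communist"]
comm_vs_neutral = total_scores["communist"] - total_scores["neutral"]

for group, mean in item_means.items():
    side = "above" if mean > NEUTRAL_POINT else "below"
    print(f"{group}: {mean:.2f} ({side} the neutral point)")
print(comm_vs_fascist, comm_vs_neutral)
```

Only the fascist sample falls above the neutral point; the communists sit much closer to the "neutral" group than to the fascists.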

The "neutral" group which is so
fascinatingly aberrant again demonstrates
its uniqueness. The score of
2.50 is the second lowest score obtained
in roughly 50 samples with which the
writer is familiar. What makes this
fact so interesting is that this was a
working-class sample and working-class
samples tend to make higher
scores on the F scale than comparable
middle-class samples (the correlation
between F-scale scores and education,
usually part of class definition,
has been estimated as being between
-.50 and -.60 for American samples)
(4, pp. 168-170). The only
known group making a lower score
than Coulter's neutral group consisted
of 26 graduate students at the
University of California (with an
average of 6 semesters of graduate
work) who refused to sign a special
loyalty oath. This group made a
score of 1.88 (9, pp. 124-126). (A
comparison group of signers with
similar education scored 2.73 which
is higher than Coulter's "neutral"
group.)

American college students usually
score in the 3.0 to 4.0 range on the F
scale (1, Table 12 (VII), p. 266; 11,
Table 9, p. 245). If American findings
are applicable to British populations
we should expect a representative
sample of British working-class
males to make even higher scores.
Such an expectation would seem to
be in accordance with Eysenck's own
data since an examination of the proportionate
acceptance of T-scale
items by working-class samples as
contrasted with middle-class samples
(see Table 1) indicates that
among sample members of all parties
the working-class respondents accepted
more of the items falling into
the tough-minded conservative quadrant.
These have high similarity to
the sorts of items that enter into the
F scale and the correlated E and
PEC scales.

It is of obvious importance to have
comparative data on British samples.
In so far as the writer knows, there
is no published material of this sort.
However, Rokeach has recently been
in Great Britain and administered
the F scale as well as other measures.
The item means on two samples of
university students were 3.26 (N
= 80) and 3.57 (N = 137). These findings
suggest no marked differences
between American and British university
students on the F scale. A
group of 60 workers at Vauxhall
Motors made an item mean on the F
scale of 4.74.⁹ This higher score by
working-class men as contrasted with
a college sample is in accordance with
American findings and does not allay
suspicion that there was something
extremely unusual about Coulter's
"neutral" working-class sample.


Eysenck reports that Coulter's
sample of communists scored higher
on the F scale than the politically
"neutral" group. Available evidence
indicates that communists tend to
score lower than members of other
political parties on the F scale. It
can be argued that Eysenck's own
material supports the latter point
of view. An examination of Table
1 indicates that they were the least
accepting of all samples of members
of various parties when it came to
the items in the conservative tough-minded
quadrant which, as has been
noted, are similar to F-scale items in
meaning. It has also been argued
that the selection of items in the T
scale was apparently somewhat capricious.
If we examine the items in
this quadrant which were not included
in the T scale (membership
being determined from R and T factor
saturations as given in 8, Table
XX, p. 129) we find that a highly
similar pattern of acceptance occurs.
The mean percentage acceptance of
the five T-scale items in this quadrant
by working-class communists is 18,
while of the seven non-T-scale items
(17, 22, 26, 27, 30, 31, and 33) it
is 20. The corresponding figures for the
most similar group, the socialists, are
44 and 55 per cent acceptance for
items included and not included in
the T scale (computed from 6, Table
III, p. 203).

⁹ Rokeach, M., Personal communication.
1955. The full implications of Rokeach's
findings will be developed in his forthcoming
monograph on political and religious dogmatism.

Direct comparisons with American 
samples are not available. Although 
a sprinkling of communists was in- 
cluded in the samples described in 
The Authoritarian Personality, their 
F-scale scores were not reported. 
However, their scores on the E scale 
were given as well as the correlation 
between the E and F scales in the 
samples in which these communist 
subjects were included. Upon the 
basis of this data it is clear that these 
communist subjects scored extremely 
low on the F scale (see 4, pp. 130-
133, for a fuller discussion as well as
the congruence of such a conclusion 
with earlier work with communist 
responses on Stagner’s measure of 
fascism). 

We are therefore in complete disagreement
with Eysenck's conclusions
that the F scale: (a) measures
"authoritarianism" instead of potential
fascism, or (b) that communists
make higher F-scale scores than
samples of members of less extreme
political groups. The only support
for his position comes from the incredibly
low F-scale score purportedly
made by Coulter's "neutral"
group. By using this score as a base
and ignoring the implications of his
own data and the research of others
he arrives at a conclusion which we
believe to be untenable.


FURTHER CONSIDERATIONS 


It is clear that Eysenck's communist
samples are neither "tough-minded"
nor "authoritarian" when
the data produced as evidence by 
Eysenck are carefully examined. Our 
analysis clearly indicates that com- 
munists respond to T-scale items 
simply in terms of the radical-con- 
servative loading and not in tough- 
or tender-minded fashion. This is a 
graphic illustration of the danger in- 
herent in assuming, as Eysenck ap- 
parently did, that a scale which pre- 
sumably measures one thing in a 
“normal” population (in the statisti- 
cal sense) measures the same thing in 
a radically different population. 

The point may be clarified by considering
a specific example. Item 10,
"It is wrong that men should be permitted
greater sexual freedom than
women by society" may be interpreted
in alternate ways. It may be
that the "wrong" is based upon the
premise that neither men nor women
should be allowed sexual freedom;
violation of this standard is therefore
"wrong." This is apparently the interpretation
of the item made by
Eysenck's basic middle-class sample
since the factor analysis of their responses
placed the item on the "tender-minded"
side, which is characterized
by acceptance of religious and
ethical items. An alternative interpretation
is also possible. The item
might be accepted by those who believe
that both men and women
should be allowed sexual freedom
and it is "wrong" to restrict the sexual
freedom of women. This, it is
suggested, might lie behind the fact
that the communists were most accepting
of this "tender-minded" item
(see 12, Table I). This possible explanation
is supported by the fact
that the communists sampled were
much more approving of companionate
marriage (Item 29) than any
other group.

Rokeach and Hanley's argument
that Ferguson's "religionism" and
"humanitarianism" factors account
for Eysenck’s data better than the 
R and T factors is convincing. This 
can be easily demonstrated for T- 
scale items by an examination of Fig. 
1. However, it is also difficult not to 
appreciate the clear-cut radical-conservative
axis that appears in Eysenck's
data and to agree with
Eysenck that there are semantic advantages
in using R and T when dealing
with political parties. It is contended
that what weakens Eysenck's
position is the fact that he has no 
items which are relatively pure meas- 
ures of T. It is further argued that 
this is a direct consequence of his 
original procedure. 

Eysenck originally collected, "From
a total of some 500 items, all those
. . . which had been shown to be of
importance or relevance in any previous
research. When pruned of duplications,
it was found that the items
did not suffice to make up the minimum
number considered requisite,
and others were added by random
selection until 40 items altogether
had been chosen" (8, pp. 121-122).
It is extremely difficult to believe that
the 40 items used exhaust the range 
of possibly relevant or important so- 
cial attitudes (see 8, Table XVIII, 
pp. 122-124). 

It is therefore pertinent to question
the consequences of Eysenck's
original item selection procedures.
If, instead of taking items which had
been of relevance in previous research,
he had analyzed the definition of
tough-mindedness and then selected,
invented, or modified items which
appeared relevant, and then factor
analyzed responses to them and
other items, he might well have isolated
a much purer dimension of
"tough-tender-mindedness." Such
a comment implies that there are
such items or they might be found.
Machiavelli makes many statements
that are tough-minded, to say the
least, but are not concerned with sex,
religion, nor punitive reflections upon
man (as are Eysenck's tough-minded
items). Whether a tough-mindedness
scale could be constructed whose
items are relatively independent of
radicalism-conservatism or not, is an
empirical question.

Although this is as yet an unresolved
problem it has a great deal to
do with what is a key hypothesis in
Eysenck's theorizing: ". . . there is
in truth only one ideological factor
present in the attitude field, namely
that of Radicalism-Conservatism.
The T-factor itself does not constitute
an alternative ideological system
but is rather the projection on to the
social attitude field of a set of personality
variables" (8, p. 170). It is suggested
that Eysenck was forced into
the above position as a consequence
of his original selection of items which
did not cover aspects of tough- and
tender-mindedness which were relatively
independent of radical-conservatism.
If such items exist, then
he might have found two ideological
factors, the one radical-conservative
and the other a means-ends dimension.
Does one take an amoral attitude
in implementing political ideology
(be it radical or conservative) or
is there a concern with ethics and
principles?

It is of interest to note what personality
variables Eysenck believed
were relevant to political attitudes.
He suggests, ". . . 'tough-mindedness'
is a projection on to the field of
social attitudes of the extraverted personality
type, while 'tender-mindedness'
is a projection of the introverted
personality type" (8, p. 174). Thus
Eysenck believes that communists
and fascists are extraverted whereas
conservatives and socialists are introverted.
Evidence comes from
Coulter's study in which TAT ratings
on extraversion gave the communist
and fascist samples a higher
score than members of the "neutral"
group (8, p. 180). The dangers of
comparisons utilizing scores made by
the latter group have already been
indicated.

Eysenck also cites an unpublished study by George in which correlations
between introversion-extraversion and T were found. There was no marked
relationship with R. Neither R nor T was related to Eysenck’s other
personality factor of neuroticism (8, pp. 177-179).

It is impossible to confirm or deny Eysenck’s hypotheses in any
conclusive fashion upon the basis of available data. As an alternative
it is suggested that both radicalism and a true tough-minded amoral
syndrome might well be related to personality factors but that the
relationships depend upon the social setting. Any attempt to relate
personality variables to political ideology without taking the social
context into account is apt to be highly misleading as well as an
oversimplification of some highly complex interrelationships. Thus in a
study of communist defectors, Almond (2) reports marked differences
between middle-class and working-class members in the patterns of
motivation leading to their entry into the party. The former, on the
basis of analyses of interviews, were characterized by a high incidence
of neuroticism, the latter were not. These personality differences were
also related to the type of role played in the party since the
screening and training of party members led to quite marked role
differentiation. Those who became communist elites were quite different
personality-wise from those who did not. Almond presents a convincing
argument indicating that different sorts of individuals are attracted
to the Communist Party in different countries, at different historical
periods (before and during the Popular Front period), that in some
countries minority members are attracted and in others they are not,
and a host of relevant social and historical factors are operative in
causing people to join the Communist Party.

Any simple statements about the “communist personality” can fairly be
said to reflect a lack of appreciation for the complex social processes
involved in ideological deviance. This is not to say that members of
the Communist Party are not unique along certain personality
dimensions. This is not to say that communists and fascists have no
personality characteristics in common which differentiate them from the
“political normal” population. The point being emphasized is that there
is a wide range of diversity among members of communist and fascist
parties and any broad generalizations about the characteristics of
communists and fascists which are based upon limited samples are highly
suspect.

Despite profound disagreement with Eysenck’s methodological
capriciousness and his restricted theoretical position, there are some
valuable insights which can be derived from a critical analysis of his
data. It is apparent that communists differ from others in the
importance of the radical-conservatism dimension in responding to
items. It is also clear that he has provided, albeit unwittingly,
compelling evidence that the F scale actually measures fascistic
ideology.

What is especially interesting about Eysenck’s data is the fact that it
clearly refutes any notion that communists are mirror images of
fascists. The communists sampled are markedly different not only from
adherents of the major political parties but from fascists as well. The
generalizability of what has been informally called the
“Budenz-Bentley syndrome” (authoritarians of the right and left are
similar so it is easy to switch from one extreme to the other) is not
supported. It should be noted that Almond’s data also refute this
hypothesis. He found that only 10 per cent of his sample of communist
defectors became religious converts or returnees, members of the
extreme right, or conservatives. The majority (53 per cent) became
moderate socialists or trade unionists, 6 per cent remained on the
extreme left, 18 per cent were politically indifferent, and the
remaining 13 per cent were classified as “other” or “unknown” (2,
Table 15, p. 357, and 2, p. 357).


The present critique has focused upon Eysenck’s treatment of communists
and fascists along the dimensions of tough-mindedness and
authoritarianism. It would be grossly unfair as well as misleading to
imply that Eysenck considers these samples as similar on all other
dimensions. He cites Coulter’s analysis of TAT protocols in which it
was found that the correlation between ratings of direct vs. indirect
aggression was −.94 among the communists sampled and +.61 among the
fascists sampled (8, p. 205).

Since Coulter’s thesis has not been published, a more detailed
methodological analysis is inappropriate at the present time. Her
research is of interest, however, not only because of the many striking
relationships found (as the magnitude of the correlations cited by
Eysenck indicates) but also because she utilized a battery of
diversified instruments. Criticism of the use of a “neutral” group of a
highly atypical nature as a basis for comparisons does not necessarily
imply that Coulter’s actual findings are not valuable.


SUMMARY 


Eysenck’s treatment of the per- 
sonality of communists has been sub- 
jected to detailed analysis in the pre- 
ceding pages. It is concluded that: 

1. The samples studied are not 
representative of the parent popula- 
tions, that there is differential bias 
in the sampling of various groups, and 
that generalizations drawn from these 
samples are therefore unwarranted. 

2. The “tough-mindedness” scale leads to misleading comparisons among
members of various political parties because of biases built into the
scoring system. Further, the T scale clearly does not measure
tough-mindedness among the communists sampled since they responded to
individual items in terms of their radical-conservative loading.

3. The contention that communists are “authoritarian” as measured by
the F scale is unjustified since it is based on the comparison of a
communist sample with a highly aberrant “neutral” group.

4. Procedures which are utilized to differentiate communists and
fascists from other samples are highly irregular and violate the data.

5. A re-examination of the data 
indicates that the communists and 
fascists sampled differed from one 
another in crucial aspects as well as 
being different from the various com- 
parison groups sampled. 


REFERENCES 


1. Adorno, T. W., Frenkel-Brunswik, Else, Levinson, D. J., & Sanford,
R. N. The authoritarian personality. New York: Harper, 1950.

2. Almond, G. A. The appeals of communism. Princeton: Princeton Univer.
Press, 1954.

3. Centers, R. The psychology of social classes. Princeton: Princeton
Univer. Press, 1949.

4. Christie, R. Authoritarianism re-examined. In R. Christie & Marie
Jahoda (Eds.), Studies in the scope and method of the authoritarian
personality. Glencoe, Ill.: Free Press, 1954.

5. Eysenck, H. J. Primary social attitudes: 1. The organization and
measurement of social attitudes. Int. J. Opin. Attitude Res., 1947, 1,
49-84.

6. Eysenck, H. J. Primary social attitudes as related to social class
and political party. Brit. J. Sociol., 1951, 2, 198-209.

7. Eysenck, H. J. The scientific study of personality. London:
Routledge & Kegan Paul, 1952.

8. Eysenck, H. J. The psychology of politics. London: Routledge & Kegan
Paul, 1954.

9. Handlon, Britomar J., & Squier, L. H. Attitudes toward special
loyalty oaths at the University of California. Amer. Psychologist,
1955, 10, 121-127.

10. Hyman, H. H., & Sheatsley, P. B. “The authoritarian personality” –
a methodological critique. In R. Christie & Marie Jahoda (Eds.),
Studies in the scope and method of the authoritarian personality.
Glencoe, Ill.: Free Press, 1954.

11. Riecken, H. W. The volunteer work camp: A psychological evaluation.
Cambridge: Addison-Wesley, 1952.

12. Rokeach, M., & Hanley, C. Eysenck’s tender-mindedness dimension: a
critique. Psychol. Bull., 1956, 53, 169-176.

13. Stouffer, S. A. Communism, conformity, and civil liberties. Garden
City, N.Y.: Doubleday, 1955.

14. Annual abstract of statistics. No. 90. 1953. London: Her Majesty’s
Stationery Office, 1953.

15. 1950 population census report P-E No. 5B. Washington: Bureau of the
Census, 1953.

16. Statistical abstract of the United States: 1953. (Seventy-fourth
Edition). Washington: Bureau of the Census, 1953.


Received August 10, 1955 








PSYCHOLOGICAL BULLETIN 
Vol. 53, No. 6, 1956 


THE PSYCHOLOGY OF POLITICS AND THE PERSONALITY 
SIMILARITIES BETWEEN FASCISTS AND COMMUNISTS 
H. J. EYSENCK 
Institute of Psychiatry (Maudsley Hospital) University of London 


To have one’s writings submitted 
to a very detailed and exhausting 
critique in the pages of the Psycho- 
logical Bulletin is a great honor; 
to have this happen twice is some- 
what overwhelming. Before, there- 
fore, replying to Christie’s comments 
(1) I would like to take this oppor- 
tunity of thanking both him and my 
earlier reviewers (6) for drawing at- 
tention to several minor misprints 
in The Psychology of Politics (4). 
While, as will be seen, I cannot agree 
with any of the major criticisms put 
forward, I shall always be indebted 
to them for their painstaking examination of the details of my book.¹

¹ Some of the points Christie makes have already been answered in my
earlier reply to Rokeach and Hanley (5). The reader may like to consult
this earlier paper in conjunction with the present one.

It is curious how much alike Christie (1) and Hanley and Rokeach (6)
are in their failure to deal with the logical development of the
theories and experiments outlined in this book (4). Psychological
theory and factorial studies agreed in showing that the interrelations
of social attitudes in Great Britain required at least two orthogonal
factors or dimensions for their description; these factors were labeled
R (for radicalism-conservatism) and T (for tough-mindedness vs.
tender-mindedness). Many theoretical and practical reasons are given
why, descriptively, these two factors are superior to any of the
innumerable alternative rotations which could be made, and Christie
appears to agree with this when he says that it is “difficult not to
appreciate the clear-cut radical-conservative axis that appears in
Eysenck’s data, and to agree with Eysenck that there are semantic
advantages in using R and T when dealing with political parties.”

Our theoretical position leads us to believe that the T factor is the
projection onto the attitude field of the personality dimension of
extraversion-introversion, in the sense that extraverts will have
tough-minded attitudes, introverts tender-minded attitudes. The content
of the attitudes of extraverts and introverts respectively will be
determined by their position on the radicalism-conservatism axis. It
would follow from this hypothesis that there should be very few, if
any, pure T items; tender-mindedness and tough-mindedness should always
appear in conjunction with either right-wing or left-wing tendencies.
This is what we have found in actual fact after an examination of many
hundreds of different items. It is very satisfying to find hypotheses
supported in this way, yet oddly enough Christie appears to hold the
opposite view. He writes “It is contended that what weakens Eysenck’s
position is the fact that he has no items which are relatively pure
measures of T.” The fact that if many such items could be found, the
theory which has been elaborated in The Psychology of Politics would
be, not just weakened, but completely disproved, does not seem to occur
to Christie. He blames our procedure of item selection for our failure
to find pure T items, and says that if the writer “had analyzed the
definition of tough-mindedness and then selected, invented, or modified
items which appeared relevant and then factor analyzed responses to
them and other items he might well have isolated a much purer dimension
of ‘tough–tender-mindedness.’ Such a comment implies that there were
such items or they might be found. Whether a tough-mindedness scale
could be constructed whose items are relatively independent of
radicalism-conservatism or not, is an empirical question.”

Having attempted for many years to do what Christie advocates, and
having had several students make similar attempts, all without success,
the writer believes that Christie is somewhat optimistic. Perhaps if he
had himself some practical experience in carrying out work of this kind
he might be less inclined to dismiss the concentrated efforts of
several people over many years in this superficial fashion. It is
impossible for the writer to prove a negative, i.e., to prove that such
pure T items do not, in fact, exist; all that can be done is to carry
out the search over a long enough period and wide enough field to make
one’s failure to find such items convincing evidence to the
unprejudiced judge. Christie’s critique would have gained considerably
if he had shown some appreciation of the methodological position, and
even more if he had actually succeeded in unearthing such items.

Granted that hitherto no pure T items have emerged, and granted also
that the dimensional analysis of the attitude field requires two
dimensions, it is clearly essential for the construction of a T scale
to use items having reasonably high correlations with T, and which are
selected in such a way that their correlations with the R scale balance
out. Christie’s comment on this is that “The crucial point in an
interpretation of Eysenck’s results is that the T scale is a somewhat
better measure of R than T. The mean loading of T scale items on T is
.38, on R .48.” The confusion evident in this quotation appears to
invalidate most of Christie’s argument as far as it relates to the
construction of the T scale. The crucial point is that items are
selected in such a way that if we have two tough-minded items one would
be a radical, the other one a conservative item. By adding the two we
add the T variances and cancel out the R variances. As an example of
this, let us consider an imaginary miniature scale consisting of two
items. The first item relating, say, to trial marriages has a loading
of +.6 on R and +.5 on T; the other item relating, say, to the death
penalty has a loading of −.6 on R and +.5 on T. For the purpose of the
T scale a “Yes” answer would in each case be counted one point. A
person saying “Yes” to both questions would therefore get a score on
the T scale of 2, a person answering “No” to both questions would get a
score of zero. The fact that both items have higher correlations with R
than with T does not mean that the sum of the answers is a good measure
of R. A person high on R would say “Yes” to the first and “No” to the
second item; a person low on R would reverse this. This point would
seem too elementary to discuss in such detail, but as much of
Christie’s critique is based on it, it seemed desirable to clear it up.
Rokeach and Hanley appear to be subject to a similar error of
interpretation. If Christie were, in fact, correct in his contention
that the T scale is a good measure of R, then it should correlate with
the R scale. As the studies reported in The Psychology of Politics (4)
show, no such correlations have in fact been observed.

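The two-item miniature scale described above can be sketched in a few lines of Python. This is an illustration only, not a procedure taken from the book. The loadings (+.6 and −.6 on R, +.5 on T) and the one-point-per-"Yes" scoring come from the text; the item keys, function names, and the use of a plain unnormalized sum of loadings to show the sign cancellation are assumptions made for the example.

```python
# Loadings (R, T) of the two imaginary items described in the text.
ITEM_LOADINGS = {
    "trial_marriages": (0.6, 0.5),
    "death_penalty": (-0.6, 0.5),
}


def t_scale_score(answers):
    """Score one point for every 'Yes' (True) answer."""
    return sum(1 for said_yes in answers.values() if said_yes)


def summed_loadings(loadings):
    """Add the item loadings factor by factor (unnormalized, illustrative)."""
    r_total = sum(r for r, _ in loadings.values())
    t_total = sum(t for _, t in loadings.values())
    return r_total, t_total


# Response patterns discussed in the text.
yes_to_both = {"trial_marriages": True, "death_penalty": True}
no_to_both = {"trial_marriages": False, "death_penalty": False}
high_r = {"trial_marriages": True, "death_penalty": False}
low_r = {"trial_marriages": False, "death_penalty": True}

if __name__ == "__main__":
    print("summed (R, T) loadings:", summed_loadings(ITEM_LOADINGS))
    for label, pattern in [
        ("yes to both", yes_to_both),
        ("no to both", no_to_both),
        ("high R", high_r),
        ("low R", low_r),
    ]:
        print(label, "->", t_scale_score(pattern))
```

In this sketch the R loadings sum to zero while the T loadings add, and the high-R and low-R respondents receive the same score of 1, which is the cancellation the paragraph describes.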
The writer would readily admit that our first version of the T scale
fell short of perfection in several respects; this was one reason why
an improved version was constructed by Melvin (9). However, Christie is
in the unfortunate position that if we completely accepted his
criticism of the scoring system adopted, then our results would support
even more strongly our own hypothesis, and go counter to his. He
maintains that “by virtue of an asymmetric distribution of items
combined with Eysenck’s singular scoring system, a hypothetically
consistent fascist is automatically made more ‘tough-minded’ by one
point than a hypothetically consistent communist.” As we have
throughout found communists to be slightly less tough-minded than
fascists, Christie’s argument would suggest that, in fact, we should
increase the communists’ scores by one point, thus making them even
more like the fascists than appears in our results. As Christie’s main
argument appears to be that communists are not tough-minded at all, and
are quite unlike fascists in this respect, acceptance of his criticisms
of our scoring system would, therefore, strengthen our position and
weaken his.

The same is true when we look at another comment. Christie maintains
that “the arbitrary system of scoring which treated zero responses as
‘tough-minded’ thus introduced a bias of unknown extent in the
direction of making the members of the three major parties more
‘tough-minded,’ relatively speaking, than those of the two deviant
parties.” Again, even if Christie’s criticism were well taken, it would
merely mean that we had loaded the dice against our own hypothesis;
making the appropriate corrections would make our results support our
theory even more strongly.

Another criticism of the scoring system the writer does not understand
at all. Christie maintains that “the T scale simply does not apply to
communists (or at least to this sample). Comparisons of scores made by
communists on a scale on which they do not respond along the continuum
measured with scores by other samples are meaningless.” Just what is
meant by saying that a certain scale “simply does not apply” to a
certain group? One might imagine that it would have zero, or at least
quite low reliability for that group; yet Coulter has shown that the
reliability of the T scale is higher for the communists than for
fascists, or our neutral group (2, p. 43). Does it, perhaps, mean that
our measurement of T is only a watered-down and less reliable measure
of R? The reliability of the T scale for communists is higher than that
of the R scale, and the two scales do not correlate. Does it, perhaps,
mean that T does not correlate with other variables in the case of
communists, while it does so in the case of fascists and other groups?
Again, Coulter (2) has shown that the opposite is true, if anything. Is
it that scores on the scale do not behave in conformity with firmly
grounded theory? But here again, as shown in The Psychology of Politics
(4) and the more recently concluded study by Nigniewitzky (10), to be
discussed below, it is found that communists behave precisely in the
predicted manner.

Is it that the T scale is irrelevant to political party structure as
compared with the R scale? Here, Nigniewitzky’s finding on a
representative sample of the French middle-class population is
relevant; he finds that the T scale, while independent of the R scale
statistically, is actually superior to the R scale in differentiating
between members of the different political parties (including the
communists) (10). It is submitted, therefore, that Christie’s statement
is strictly meaningless. If Christie had quoted the relevant
statistical findings, this fact would have become apparent immediately.

We must now turn to the problem of sampling. Christie spends a
considerable amount of space in trying to show that our middle-class
sample was “completely unrepresentative of the British middle class.”
As the writer himself has stressed this point several times, Christie’s
work appears to be a task of supererogation. As was pointed out in The
Psychology of Politics (4, p. 127): “Our interest lay not in obtaining
a representative cross-section of the population but in comparing
different political groups. This can best be done by having the groups
of equal size, thus reducing sampling errors to a minimum. If mean
values are wanted for the total population, then mean values for the
selected groups can be multiplied by the proportions these groups form
of the total population, thus giving an adequate indication of
population values.” Again, Christie appears to doubt this statement:
“In view of the fact that Eysenck’s basic middle-class sample is
markedly unrepresentative of the British middle-classes, it would be
highly dangerous to project their attitudes to obtain an estimate of
the parent populations.”

It may be tedious to the reader to 
spell out this point in detail because 
of its quite elementary nature, but 
as Christie has devoted so much 
space to it, his misinterpretation
requires correction. If we are in-
terested in the variance contributed
to a given score by a number of fac- 
tors, such as political party, sex, age, 
and education, then the most efficient 
design for giving us such informa- 
tion is obviously one in which all the 
possible groups into which these four 
methods of classification divide the 
population are represented in equal 
number. A representative sample of 
the population would be relatively 
inefficient, particularly when some of 
the groups (liberals, university-edu- 
cated) comprise only a very small 
portion of the population. Mean 
values from such an analytic sample 
cannot, of course, be taken as repre- 
sentative of the whole population; 
we would require to correct the fig- 
ures obtained for each subgroup by 
taking into account the proportion 
of people in that special group in the 
total population. When this is done 
we obtain an estimate of population 
parameters which is only a little in- 
ferior to one obtained from a random 
sample. Thus, an analytic sample is 
vastly superior to a random sample 
with respect to the analysis of the 
influence of different factors, and is 
very little, if at all, inferior to it with 
respect to obtaining estimates of 
population parameters. As our pur- 
pose was not that of obtaining popu- 
lation parameters but of determining 
the relative influence of the factors 
indicated, Christie's argument ap- 
pears to be quite irrelevant to the 
facts of the situation. 
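The correction described above, in which each subgroup mean is weighted by that subgroup's share of the total population, can be sketched as follows. This is a minimal illustration under stated assumptions: the group names, mean scores, and population shares are invented for the example and are not figures from the book, and the function name is hypothetical.

```python
def population_mean(group_means, group_shares):
    """Weight each subgroup mean by that subgroup's share of the population."""
    if set(group_means) != set(group_shares):
        raise ValueError("means and shares must cover the same groups")
    if abs(sum(group_shares.values()) - 1.0) > 1e-9:
        raise ValueError("population shares must sum to 1")
    return sum(group_means[g] * group_shares[g] for g in group_means)


# Hypothetical figures: equal-sized analytic groups with different mean scores,
# and the (assumed) proportion each group forms of the total population.
means = {"conservative": 10.0, "liberal": 14.0, "socialist": 12.0}
shares = {"conservative": 0.5, "liberal": 0.1, "socialist": 0.4}

# Simple average over the equal-sized groups; not a population estimate.
unweighted = sum(means.values()) / len(means)

# Estimate of the population mean after correcting for group proportions.
weighted = population_mean(means, shares)

if __name__ == "__main__":
    print("unweighted mean of groups:", unweighted)
    print("population-weighted mean:", round(weighted, 2))
```

With these invented numbers the small liberal group pulls the unweighted average upward, and the weighted figure corrects for that. This is why means from an equal-sized analytic sample should not be read as population values without the correction.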

It should not be assumed from this, 
however, that our sampling proce- 
dures are not subject to criticisms 
on any point. We know of no com- 
plex study in social psychology which 
has handled this problem with com- 
plete adequacy, and we have through- 
out been aware of certain weaknesses 
in our sampling procedures. The details
have always been given in sufficient
detail to enable the reader to
form his own views as to the degree 
to which our conclusions should be 
modified because of these imperfec- 
tions. In this our writings are in de- 
cided contrast to Christie's own cri- 
tique. He seems to be quite happy to 
establish a point by referring to work 
carried out by Rokeach in which 
scores are given for groups of stu- 
dents and Vauxhall Motors workers 
without any mention at all of sex 
composition, method of sampling 
used, and so forth. Critics who cavil 
at the relatively full data presented 
in respect to the sampling pro- 
cedures used by the writer might be 
expected to heed their own advice. 
In view of Christie's failure to give 
any details at all, the writer cannot 
take seriously the means presented, 
or the criticisms based on them. 

It is fortunate that quite recently it has become possible to carry out
a large-scale study in France, making use of a properly selected
representative sample of the French middle-class population. This study
was carried out by R. Nigniewitzky (10) and gave results which are of
considerable relevance to Christie’s remarks. Communists on the new and
improved form of the T scale were found to have a mean score of 10.3;
fascists to have a mean score of 10.2; communist fellow-travelers had a
mean score of 10.2. The mean score of the supporters of all the other
main French parties was 17.6. Communists and fascists again appear as
very much more tough-minded than the democratic parties.

These results are important for several reasons. Christie takes us to
task for selecting communists and fascists who were actively engaged in
political work, and comparing them with people who voted for the main
three parties, but were not specially active in the political world.
This, he maintains, introduced a sampling bias because differences may
be due to the factor of being politically active rather than to being
procommunist or profascist. This argument is almost impossible to
disprove because in England members of the communist party and
communist adherents generally are all characterized by this strong
degree of political activation; it would be practically impossible to
find communists and fascists not active in this way, and if any could
be found they would be extremely atypical. Conversely, the typical
conservative, liberal, or socialist voter or party member, however
strong his convictions, does not indulge in the same kinds of
activities as does the communist or fascist. It would, therefore, be
not just difficult but impossible to find conservatives, liberals, or
socialists carrying out, with equal intensity, the kinds of things done
by communists and fascists, and again, if such people could be found
they would be extremely atypical. Christie argues “In short, to what
extent are differences in attitudes between communists and major party
members traceable to ideology per se and to what extent to other
factors relating to political activity?” It would, indeed, be
interesting to know the answer to this question, but only someone
exceptionally ignorant of conditions in Britain at the moment would
expect it to be possible to find the answer in this country.

There are other difficulties which 
make any ordinary kind of sampling 
procedure inapplicable in England. 
The number of fascists and com- 
munist party members in the whole 
country is usually considered to be
less than 100,000; thus, it would take
a sample of 300-400 people to find a 
single communist or fascist. To get 
even the relatively small number of 
86 communists and fascists which 
formed our sample, it would require 
a random sample of some 25,000 peo- 
ple. When to this is added the secre- 
tiveness of fascist party members, 
who usually refuse to answer ques- 
tions, and the contempt of commu- 
nists for this type of work, and their 
consequent aversion to taking part in 
it, the impossibility of using orthodox 
methods should be even more obvi- 
ous. As if all this were not enough, 
there is in addition the difficulty that 
if one were not to make party mem- 
bership the criterion for acceptance 
of a person as being a communist or 
fascist, one would be left with no cri- 
terion at all. In the case of the major 
parties, identification was based on 
voting behavior. This is not applica- 
ble to the fascists as there were no 
fascist candidates during the elec- 
tion, and it is hardly applicable to 
the communists because communist 
candidates were standing only in a 
very small number of highly atypi- 
cal constituencies. Christie condemns 
our method of sampling; he does not 
indicate how it could have been im- 
proved—even without taking into 
account the limitations imposed by a 
budget which never rose above, and 
frequently fell short of, the sum of 
100 dollars per annum. 
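The sample-size arithmetic in this paragraph can be checked with a short calculation. This is a rough sketch: it assumes, as the text does, that one communist or fascist turns up per 300 to 400 people sampled at random, and it ignores refusals and sampling variability. The function name is hypothetical.

```python
def required_random_sample(target_cases, one_in):
    """Expected random-sample size needed to turn up `target_cases` members
    of a group that occurs about once per `one_in` people."""
    return target_cases * one_in


# Figures from the text: 86 communists and fascists, one per 300-400 people.
lower_bound = required_random_sample(86, 300)  # 25800
upper_bound = required_random_sample(86, 400)  # 34400

if __name__ == "__main__":
    print("random sample needed:", lower_bound, "to", upper_bound)
```

The figure of "some 25,000" quoted in the text corresponds to the lower end of this range.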

It is here that our French study is so important. In France, the
communist party is a mass party, with sufficient members of a nonactive
character to make it comparable to other parties, and to make possible
orthodox methods of sampling. When this is done, as has been pointed
out above, the result shows even more striking differences in the
predicted direction than were found in this country. Thus, an
improvement in sampling procedures, as demanded by Christie, and an
improvement in the scale used do not result, as would be predicted from
his criticisms, in a lessening of the observed differences between
communists and the orthodox political parties; quite on the contrary,
the differences become much wider and much more significant.

Christie might well reply that his criticisms were concerned with the
studies reported in The Psychology of Politics, and that this new study
is irrelevant. This, however, is not so. In all experiments which
involve sampling, the investigator has to make certain decisions as to
which factors are, and which are not, likely to influence the results,
and in need of experimental control. Similarly, the reader has to
decide to what extent he is willing to accept the investigator’s
judgment and to what extent he is prepared to reject it. Even the best
stratified sampling procedure involves a decision as to the relevant
variables which are to be used for the stratification. There are
grounds here for legitimate disagreements. No random sampling procedure
fails to encounter the problem of nonresponders; no method of handling
this is beyond criticism. In studies like the ones reported in The
Psychology of Politics, where random and stratified sampling could not
be used in the orthodox manner, decisions have to be made by the
investigator with which the reader may disagree legitimately. Only
additional investigations can settle issues which otherwise must remain
a matter of opinion. In the writer’s view, the sampling methods used in
The Psychology of Politics, while far from perfect, have adequately
substantiated the hypothesis under investigation. According to Christie
they have not. The only way of deciding is not by rather pointless
argument, but by further experiment.² It is the writer’s view that the
Nigniewitzky (10) experiment has settled the issue as far as the
sampling controversy is concerned.

A good deal of Christie’s argument 
isconcerned with findings from Ameri- 
can studies, which he believes con- 
tradict our own findings. He appears 


to believe that relationships between 


social attitudes and personality fac- 
tors depend upon the social setting. 
“Any attempt to relate personality 


² One of the criticisms made by Christie may serve as an example of the
kind of point on which legitimate disagreements might arise. The
writer, having found that certain controls, such as age, were
uncorrelated with T in his middle-class sample, did not consider it
necessary to impose these controls on his working-class sample as this
would have made the investigation very much more expensive and
cumbersome. Christie argues that while controls were irrelevant in the
middle-class sample there is no proof that they were irrelevant in the
working-class sample, and that consequently the controls should have
been retained. This is a possible point of view. It would be more
satisfactory if all sources of variation could be controlled in
experiments of this kind. As this is impossible, judgments have to be
made as to the relative importance of different aspects of the
investigation. In the absence of any evidence to the contrary, it
seemed unlikely to the writer that correlations between T on the one
hand and age on the other would be so very dissimilar in a
working-class group as compared with a middle-class group. Christie
quotes some evidence to show that relationships between attitudinal
variables are different in middle- and working-class samples, but that,
of course, is quite a different point; we are here concerned with
correlations between factor scores and control variables. It may be
said, in parentheses, that in recent unpublished work we have found
relationships between T and the various control variables to be very
much the same in working-class as in middle-class samples. This does
not, of course, invalidate the principle of Christie’s criticism; it
merely illustrates that a criticism may be abstractly legitimate
without being necessarily damaging to the conclusion arrived at.




variables to political ideology with- 
out taking the social context into ac- 
count is apt to be highly misleading 
as well as an oversimplification of 
some highly complex interrelationships.” The reader might not guess
it from Christie’s comments, but this is almost precisely what the
writer himself has pointed out in his book. This is what he has to
say. After
pointing out that most of the work 
contained in The Psychology of Poli- 
tics was carried out in England, he 
goes on to say that “results from 
Germany and Sweden, as well as 
from the U.S.A., make it seem likely 
that the main conclusions drawn here 
would apply equally well there; it 
would not be wise, however, to generalize too far. . . . This is
particularly important when considering the
personality structure of members of 
groups such as the fascist and com- 
munist parties. In our culture, these 
are minority groups; it is unlikely 
that conclusions based on members 
of such groups could be transferred 
without change to members of the 
Communist Party in the U.S.S.R., or 
to members of the former N.S.D.A.P. 
in Germany. When we talk about 
communists and fascists, therefore, it is about British communists
and fascists we are talking, not about their foreign prototypes. At
times the
reader will undoubtedly be tempted 
to generalize beyond this restriction; 
if he does, he does so at his own peril” 
(italics not in original). Many of 
Christie's arguments and criticisms 
are based on assumed similarities be- 
tween English and American condi- 
tions. He is free to indulge in these 
speculative exercises, but the writer 
should make it clear that they have 
little relevance to his own writings or 
views. Attempts have been made to extend our work to other countries
like Spain (11), France (10), Sweden (7), Germany (3), the Near East
(8), and so forth; the accumulation of facts would appear more
important than the armchair theorizing in which Christie delights. The
reader of these detailed reports may form his own views regarding the
degree of cultural dependence of R and T.


REFERENCES 


1. Christie, R. Eysenck’s treatment of the personality of communists.
   Psychol. Bull., 1956, 53, 411-430.
2. Coulter, T. An experimental and statistical study of the
   relationship of prejudice and certain personality variables.
   Unpublished doctor’s dissertation, Univer. of London, 1953.
3. Eysenck, H. J. Primary social attitudes: a comparison of attitude
   patterns in England, Germany, and Sweden. J. abnorm. soc. Psychol.,
   1953, 48, 563-568.
4. Eysenck, H. J. The psychology of politics. London: Routledge and
   Kegan Paul, 1954.
5. Eysenck, H. J. The psychology of politics: a reply. Psychol. Bull.,
   1956, 53, 177-182.
6. Hanley, C., & Rokeach, M. Care and carelessness in psychology.
   Psychol. Bull., 1956, 53, 183-186.
7. Husen, T. Mätningar av Intressen och Attityder. Personalen och
   Företaget, 1951, 80-87.
8. Keehn, J. D. An examination of the two-factor theory of social
   attitudes in a Near-Eastern culture. J. soc. Psychol., 1955, 42,
   13-20.
9. Melvin, D. An experimental and statistical study of two primary
   social attitudes. Unpublished doctor’s dissertation, Univer. of
   London, 1955.
10. Nigniewitzky, R. W. A statistical and experimental investigation of
    rigidity in relation to personality and social attitudes.
    Unpublished doctor’s dissertation, Univer. of London, 1956.
11. Pinillos, J. L. Attitudes sociales primarias. Rev. Univer. Madrid,
    1953, 11, No. 7, 367-399.
12. Rokeach, M., & Hanley, C. Eysenck’s tendermindedness dimension: a
    critique. Psychol. Bull., 1956, 53, 169-176.


Received January 5, 1956 








PSYCHOLOGICAL BULLETIN 
Vol. 53, No. 6, 1956 


SOME ABUSES OF PSYCHOLOGY¹


RICHARD CHRISTIE 
Columbia University 


Eysenck's reply (11) to a method- 
ological critique (4) of his writings on 
personality and politics appears, at 
best, to be lacking in candidness. A 
number of specific criticisms were 
made of his work. He does not refer 
to many of these. Others he at- 
tempts to evade by distorting the 
original criticism and giving irrele- 
vant answers. This is a serious ac- 
cusation. It may be best evaluated 
by summarizing the original criti- 
cisms and then considering his re- 
sponses, if any, to them. 

One initial comment should be made. Eysenck says that the criticism
failed to “. . . deal with the logical development of the theories and
experiments outlined [in The Psychology of Politics (9)] . . .” (11,
p. 431). The reason for not taking Eysenck’s theories seriously is
simple. Their basis is essentially an inductive one.² They primarily
rest upon data collected by Eysenck and his students. Other material
which lends support is cited; that which is contradictory is slighted
or ignored (3). Errors in the collection, processing, or analysis of
data on the part of Eysenck and his collaborators are therefore
extremely relevant for the validity of his theories.

Although the temptation to rise to some of Eysenck’s more irrelevant
remarks is tantalizing, scientific criticism is best served by
returning the argument to the level of fact. The procedure followed in
the critique of systematically evaluating methodological flaws in
Eysenck’s work will be followed for the sake of simplicity and
comprehensiveness.

¹ The title of this paper is based, appropriately enough, upon that of
a book by H. J. Eysenck (8).

² This is not intended as a critical remark. The writer is favorably
disposed toward a truly inductive approach at the present state of the
development of social psychology.


THE T SCALE AND “TOUGH-MINDEDNESS”


Sampling. An analysis of Eysenck’s samples of middle-class supporters
of various political parties was made. It was concluded that they were
nonrepresentative, as evidenced by gross discrepancies in age and
education between them and estimates of the British middle class based
upon British census data. Eysenck agrees with this conclusion but
regards a systematic attempt at indicating the extent of the bias as a
“task of supererogation” (11, p. 434).

Eysenck was not criticized for using available samples or for
advocating properly conducted analytic sampling. He was criticized for
maintaining that scores of, e.g., university-educated, older,
middle-class,
male Liberals, as sampled by him, 
could be projected to the parent pop- 
ulation of individuals meeting these 
criteria in Great Britain. 

It was pointed out that Eysenck’s students tended to collect
questionnaires from individuals who were presumably most like them,
i.e., young and highly educated. At best, one in twenty among the
British middle-class population has had a smattering of university
education; well over half of Eysenck’s sample have so benefitted (4,
p. 415). Half of his sample were under 30 years of age as contrasted
with but a fifth of the British adult population (4, p. 414).
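
The consequence of such an age imbalance can be made concrete with a small sketch. The age shares below are the ones just quoted (half of the sample under 30, against a fifth of the adult population). The stratum mean scores are purely hypothetical values chosen for illustration; they are not taken from Eysenck’s data or from the critique.

```python
def weighted_mean(stratum_means, shares):
    """Mean score of a group composed of strata in the given proportions."""
    assert abs(sum(shares.values()) - 1.0) < 1e-9
    return sum(stratum_means[k] * shares[k] for k in stratum_means)

# Hypothetical mean scale scores by age stratum (illustrative only).
stratum_means = {"under_30": 6.0, "30_and_over": 8.0}

# Age composition as quoted in the text.
sample_shares = {"under_30": 0.5, "30_and_over": 0.5}
population_shares = {"under_30": 0.2, "30_and_over": 0.8}

sample_mean = weighted_mean(stratum_means, sample_shares)
population_mean = weighted_mean(stratum_means, population_shares)

# Difference between what the sample shows and what the population holds.
bias = sample_mean - population_mean
```

Whenever scores differ between age groups, the unadjusted sample mean departs from the population value. Reweighting would remove only this one source of error; it does nothing about nonrandom selection within each age group.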




Furthermore, Eysenck’s students 
gave questionnaires to friends or 
acquaintances. At best, the parent 
population of his samples can be de- 
fined as consisting of only those indi- 
viduals who were known by students 
of Eysenck’s. Strictly speaking, sta- 
tistical generalizations cannot be 
made to even this highly restricted 
parent population since there is no 
evidence that there was random selec- 
tion of respondents within this pool 
of potential subjects. Lindquist (14, 
pp. 73-74) has a clear discussion of 
the pitfalls involved in generaliza- 
tions based upon nonrandomly drawn 
samples. 

There is therefore no justification 
for Eysenck’s suggestions (5, p. 57; 9, p. 127) that the test scores
of his
admittedly nonrepresentative sam- 
ples can be projected to obtain a 
meaningful estimate of scores of the 
British population. 

No evidence whatsoever was given as to how Eysenck selected
working-class respondents of the major parties. The inference was
drawn that
these were also collected by students 
from among their acquaintances. 
This is not denied by Eysenck. 
Among the working-class respondents 
known to Eysenck’s students, ques- 
tionnaires were given to 27 Liberals 
(9, p. 137). These were selected 
solely upon the basis of being known 
by Eysenck’s students at London 
University. If anyone maintains that 
the scores of these 27 individuals can 
be meaningfully projected to all 
working-class members of the Liberal 
Party in Great Britain he may be 
even more “exceptionally ignorant of 
conditions in Britain” than this critic. 

It was also pointed out that Eysenck’s communist sample was selected
from an active political organization. They, by definition, were
active group members and in this sense differed from the majority of
the population. The question was then asked as to “. . . whether they
[Communists] are less different from those who are politically active
in major parties than from those who merely list themselves in a
particular way when asked to do so” (4, p. 413). The question raised
was: are the T-scale scores of communists (by definition, politically
active) more similar to those of active members of major political
parties than to those of inactive members of major political parties?
Eysenck does not address himself to this question. Instead he repeats
the point made in criticism that there are no inactive communists and
then concludes that it follows that the comparison is impossible!³

Finally, Eysenck’s description of Coulter’s sample of working-class
males in the British Army as being a “random sample of the British
working-class” was questioned. The absurdity of this statement was
pointed out. The test scores made by these men were compared with
Eysenck’s other sample of British working-class males and significant
differences between the mean scores of the two groups on the R
(radicalism) scale were found. Neither sample was representative.
Differences in test scores indicated they were not drawn from the same
parent population. Eysenck does not produce any new evidence to rebut
the analysis; indeed, he does not mention this aspect of the criticism.

³ Melvin compared the “tender-mindedness” scores of members of his
sample who listed themselves as “active in politics” (no definition of
activity other than respondents’ self-classification) with those
sample members who did not so list themselves (15, Fig following
p. 329). No differences emerged. Such a finding is directly relevant to
the original criticism. If Eysenck had cited it, the burden of
rejoining (that such a criterion is too vague) would have fallen upon
the critic. Instead of citing relevant evidence, however, Eysenck
perverted the logic of the argument.

Eysenck has not chosen to give a rebuttal to any of the specific
criticisms made of either his generalizations which were based upon
unrepresentative samples or of his comparisons of samples which
differed in the way they were drawn from the presumed parent
population. He attempts to evade the issue by implying that criticism
was directed solely toward his sampling procedures rather than the
generalizations based upon them. He says, “We know of no complex study
in social psychology which has handled this [sampling] problem with
complete adequacy” (11, p. 434). Although the writer disagrees, the
point is irrelevant.⁴ What is relevant is the degree of caution with
which generalizations are made from samples which cannot be randomly
drawn from the parent population. Almond’s study of communist
defectors (2) is an example of scientific restraint in such a
situation.

Measurement. Attention was directed toward the bias caused by
Eysenck’s treatment of the “no-answer” category. It was pointed out
that its always being scored as “tough-minded” led to the seemingly
paradoxical situation where a person who had no opinion on anything
emerged as the epitome of “tough-mindedness” whereas someone who
disagreed with everything was a paragon of “tender-mindedness.”
Eysenck does not choose to give any rationale for the unique
procedures which could lead to such nonsensical results.
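
The first half of this paradox can be illustrated with a minimal sketch. The scoring rule used here is an assumption reconstructed from the description in this paper, not a transcription of Eysenck’s procedure: a higher score means greater “tender-mindedness,” a point is earned by accepting a “tender”-keyed item or by rejecting a “tough”-keyed item, and a missing answer earns nothing. The four-item key and the function name `t_score` are hypothetical.

```python
def t_score(responses, keys):
    """Count 'tender-minded' points under the assumed scoring rule.

    responses: dict item -> "agree", "disagree", or None (no answer)
    keys:      dict item -> "tender" or "tough"
    """
    points = 0
    for item, key in keys.items():
        answer = responses.get(item)
        if key == "tender" and answer == "agree":
            points += 1
        elif key == "tough" and answer == "disagree":
            points += 1
        # A missing answer earns no point, so it counts as "tough-minded".
    return points

# Hypothetical four-item key (illustrative only; not Eysenck's items).
keys = {"a": "tough", "b": "tough", "c": "tender", "d": "tender"}

no_opinion = {item: None for item in keys}
all_tough = {"a": "agree", "b": "agree", "c": "disagree", "d": "disagree"}

score_no_opinion = t_score(no_opinion, keys)
score_all_tough = t_score(all_tough, keys)
```

Under these assumptions a respondent who answers nothing receives the same minimum score as one who gives the “tough-minded” response to every item, although the former has expressed no attitude at all.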


⁴ Among the studies cited in the references
listed in the critique, that of Stouffer (18) 
handles the problem of sampling in exemplary 
fashion. 




It was also pointed out that the specific comparisons between members
of the three major political parties and the two deviant groups
(communists and fascists) were affected by this scoring procedure. It
was inferred from bits of data presented in Eysenck’s writings that
members of the major parties were characterized by a higher frequency
of “no-answer” responses than members of the two deviant parties. As
Eysenck correctly notes in his rebuttal, this loaded the dice against
his “hypothesis.” The amount of error introduced by this peculiarity
of the scoring system is so much less than that arising from some of
the mistakes in addition (16) and highly aberrant methods of analysis
(4) that it does not materially strengthen Eysenck’s position. This is
the sole comment Eysenck makes about his strange treatment of the
“no-answer” category, and it does not clarify the basic issues
involved.

In examining the asymmetric distribution of items in the four
quadrants of Eysenck’s two-factor space it was pointed out that the
peculiarities of the scoring system were such that a hypothetically
consistent communist would automatically be less “tough-minded” by one
point than a hypothetically consistent fascist. Eysenck’s reply is a
model of disingenuousness: “As we have throughout found communists to
be slightly less tough-minded than fascists, Christie’s argument would
suggest that, in fact, we should increase the communists’ scores by
one point, thus making them even more like the fascists than appears
in our results” (11, p. 433). How the criticism could possibly suggest
to Eysenck that increasing communists’ scores by one point could
equate for the inadequacies of the scoring system is most puzzling.
The T scale is so scored that the higher the score, the greater the
purported “tender-mindedness.” Adding a point to the scores of
communists would therefore have the opposite effect from that
postulated by Eysenck: it would increase the difference between
communists’ and fascists’ scores on the T scale! Aside from this non
sequitur, Eysenck’s finding that communists are more “tender-minded”
(by from roughly one to two points depending on whose addition is
used) is partially due to the biases resulting from his scoring
system. Actually, since his communist sample did not respond to
T-scale items as Eysenck hypothesized they should, the bias leads to
an unknown error in the scoring system.

The criticism relating to the asymmetric distribution of 14 items in
four quadrants is not rebutted by Eysenck’s discussion of a
symmetrical two-item scale.⁵ This has nothing to do with the problem
of asymmetry which exists in both Eysenck’s and Melvin’s scales. It
may be that this is one of the points that Eysenck claims to have
answered in his reply to Rokeach and Hanley (11, Footnote 1, p. 431).
There Eysenck says, “The T score combines in equal proportions radical
and conservative items and thus gets rid of the complication
introduced by the R factor . . .” (10, p. 180).

This statement is irrelevant. An examination of Fig. 1 in the critique
(4, p. 419) indicates that seven T-scale items are saturated with
radicalism and seven are loaded on the conservative end of the axis.
This does not get rid of complications because of the asymmetry of the
dispersal of the items in the quadrants which was the point made in
criticism.

Eysenck nowhere in his writings discusses the rationale for having an
asymmetric distribution of the T-scale items (five, four, three, and
two) in the four quadrants of his factor space. Let us accept for a
moment his ground rules for the distribution of items, namely that
there should be a balance between radical and conservative items in
the T scale. Eysenck objects to “speculative exercises” based upon his
data and for very good reasons as shall be demonstrated. The exercises
to be presented are designed simply to show the absurdities arising
from taking Eysenck’s approach seriously.

First, let us assume that a communist came across Eysenck’s material
and accepted his criteria for balancing items in the T scale. With
relatively little study this person could deduce that the responses to
the items in the various quadrants of Eysenck’s factor space show
fairly stable patterns of the degree of acceptance by members of
Eysenck’s samples of adherents of various political parties. If, for
some reason, this communist wished to demonstrate that his
middle-class comrades (as sampled by Eysenck) were really not
“tough-minded” (indeed, that they were more “tender-minded” than
Eysenck’s sample of conservatives), he could do so very simply by
deleting one of the four items in the radical “tough-minded” quadrant
and adding one in the radical “tender-minded” quadrant. Eysenck’s
scoring system does not make communists “tender-minded” when they
accept radical “tough-minded” items (as is their wont). It does make
them “tender-minded” when they accept the “tender-minded” radical
items (as is their wont). Such a substitution of items, which is
completely in accord with Eysenck’s specifications, serves to make the
communists more “tender-minded.” It also makes the conservative sample
somewhat less “tender-minded.” Indeed, as shown in Table 1, the
conservatives are now more “tough-minded” than the communists!

⁵ In addition to attempting to evade criticism by a superficial
example, it is important to note that Eysenck uses “imaginary” values
(11, p. 432) for the two items in his example. He says, “The first
item relating, say, to trial marriage has a loading of +.6 on R and
+.5 on T; the other relating, say, to the death penalty has a loading
of -.6 on R and +.5 on T” (11, p. 432). The items referred to are Nos.
29 and 36 of his version of the T scale (9, Table XVIII, p. 123). The
actual loadings of these items according to Eysenck’s reported
findings are -.53 and +.56, -.60 and -.20 respectively (9, Table XX,
p. 129). Both items are loaded on the radical side of R; the first is
“tough-minded” and the second “tender-minded.” Eysenck goes on to say,
“A person saying ‘Yes’ to both questions would therefore get a score
on the T scale of 2, a person answering ‘No’ to both questions would
get a score of zero” (11, p. 432). According to Eysenck’s scoring
system, however, they would both get scores of one on the T scale but
for different reasons. A person accepting both would be given a point
for his affirmative response to Item No. 36; a person rejecting both
would be given a point for not accepting Item No. 29.

Further, “A person high on R would say ‘Yes’ to the first and ‘No’ to
the second item; a person low on R would reverse this” (11, p. 432),
indicates Eysenck’s confusion about items in his own scale and their
scoring. A person high on R should accept both statements and one low
on R should reject them if scored according to his procedure.

It is suggested that Eysenck is forced to resort to the use of
“imaginary” values because there are not, in fact, items with
empirical loadings on T and R which are so balanced that the variances
will be canceled out as in his example. The actual loadings on T and R
are extremely crucial to the problem of asymmetry of item
distribution. Eysenck’s “imaginary” loadings are not in agreement with
his findings but are given arbitrary values which support the argument
which he advances in attempting to evade criticism.

Let us speculate even further. Assume a conservative wishes to “prove”
that communists are even more “tough-minded” (especially when compared
to conservatives) than Eysenck’s figures indicate and that he has
caught on to the fine art of juggling items. Staying within the ground
rules, he then deletes three items from the conservative
“tough-minded” quadrant (to which some of Eysenck’s conservative
sample were receptive) and adds three items to the conservative
“tender-minded” quadrant (which conservatives, appropriately enough,
accept). The results of such a manipulation make the conservatives
even less “tough-minded” relative to the communists than does
Eysenck’s own procedure, as indicated in Table 1.

The preceding comparisons could have been made even more grotesque if
the deletion and substitution of items had been based upon the actual
responses to individual items as reported by Eysenck (7, Table III,
p. 200) rather than by manipulating the means of items in the
quadrants.

The point made by these “speculative exercises” is simple. The
requirements for the T scale which Eysenck has stipulated are
meaningless. His original scale was so constructed and scored that the
means reported for those affiliated with various political parties
(even when his arithmetic is corrected insofar as possible) represent
the peculiarities of the biases built into it rather than the
positions of respondents along any meaningful measure of
“tough-mindedness.”⁶
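
The arithmetic behind such item juggling can be sketched as follows. Everything numerical here is an assumption made for illustration. The text states only that the quadrants hold five, four, three, and two items and that the radical “tough-minded” quadrant holds four; the remaining assignment is inferred from the seven-to-seven balance of radical and conservative items and from the deletion of three conservative “tough-minded” items. The acceptance rates are hypothetical and are not the values of Table 1. Respondents are assumed to answer every item.

```python
TOUGH = ("radical_tough", "conservative_tough")
TENDER = ("radical_tender", "conservative_tender")

def expected_t(n_items, accept):
    """Expected T score (higher = more 'tender-minded').

    A tender-keyed item scores when accepted; a tough-keyed item scores
    when rejected. accept[q] is the proportion accepting items in quadrant q.
    """
    tender_points = sum(n_items[q] * accept[q] for q in TENDER)
    tough_points = sum(n_items[q] * (1 - accept[q]) for q in TOUGH)
    return tender_points + tough_points

# Item counts per quadrant; each layout keeps 7 radical and 7 conservative items.
layouts = {
    "original": {"radical_tough": 4, "conservative_tough": 5,
                 "radical_tender": 3, "conservative_tender": 2},
    "communist_juggle": {"radical_tough": 3, "conservative_tough": 5,
                         "radical_tender": 4, "conservative_tender": 2},
    "conservative_juggle": {"radical_tough": 4, "conservative_tough": 2,
                            "radical_tender": 3, "conservative_tender": 5},
}

# Hypothetical acceptance rates (illustrative only).
communists = {"radical_tough": 0.90, "conservative_tough": 0.30,
              "radical_tender": 0.90, "conservative_tender": 0.10}
conservatives = {"radical_tough": 0.30, "conservative_tough": 0.65,
                 "radical_tender": 0.40, "conservative_tender": 0.80}

# For each layout: (communist mean, conservative mean).
results = {name: (expected_t(n, communists), expected_t(n, conservatives))
           for name, n in layouts.items()}
```

With these illustrative rates the communists score below the conservatives on the original layout, above them after the first substitution, and much further below them after the second, although all three layouts satisfy the stated balance of radical and conservative items.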


⁶ In many ways it would have been more logical to have the T-scale
items equally distributed among the four quadrants. If this had been
done, certain interesting consequences for Eysenck’s speculations
would follow. Computations along the line indicated in Table 1
indicate that in agreement with Eysenck’s and Rokeach and Hanley’s
[. . .] Communists sam[. . .] “tough-minded” [. . .] of major political
[. . .] Eysenck’s quadrants, however, [. . .] would be more [. . .]
working-class Commu[. . .] Of even greater interest, however, is the
great stress which Eysenck’s theorizing (9, pp. 259-260) places upon
his finding that in every political party working-class samples are
more “tough-minded” than are middle-class samples. If the items had
been symmetrically distributed it would have been found that
working-class Communists and Conservatives were less “tough-minded”
than middle-class followers of these parties. The relative position on
the T scale of working- and middle-class samples of Liberal and
Socialist parties would remain the same. This is but one example of
the effect which artifacts in measurement have upon Eysenck’s theory
building.

TABLE 1
A “SPECULATIVE EXERCISE” IN ITEM JUGGLING: ARE MIDDLE-CLASS COMMUNISTS
OR CONSERVATIVES MORE “TOUGH-MINDED”?

[Table body not recoverable. Legible headings: quadrant (“tough-minded”
radical and conservative; “tender-minded” conservative and radical);
percentage acceptance of items in quadrants; number of items and
contribution to score, for communists and conservatives, under
Eysenck’s scoring, the communist’s hypothetical juggling, and the
conservative’s hypothetical juggling; total scale score.]

It would have been even more appropriate to the criticism to examine
the effects of Eysenck’s artifacts on the scores of fascists as
compared with communists and major party members. Data on fascists are
not available. However, it is clear that the present demonstrations of
the ease with which Eysenck’s scoring system can be manipulated to
produce contradictory results by a simple shifting of scale items are
a sufficient reason for viewing his conclusions as being partially the
result of illogical and unjustified procedures in scoring items.

An analysis of the responses of communists to specific T-scale items
in the four quadrants indicated that they accepted or rejected items
along a radicalism and conservatism dimension. If an item had a
saturation on the radical dimension communists accepted it whether it
was “tough-” or “tender-minded.” The analysis indicated that no
prediction could be made as to whether communists would accept T-scale
items if only their saturation on T was known but that in every case
in which their saturation on R was known a perfect prediction could be
made as to whether communists as a group would accept or reject the
item. This is the basis for making the statement that the T scale
“does not apply” to a certain group, in this case, members of the
Communist Party. It was, perhaps erroneously, believed that the
statement which Eysenck “does not understand at all” (11, p. 433) was
clear in its original context. Evidently it was not and it shall be
spelled out. Eysenck’s own data unequivocally indicate that communists
respond to T-scale items according to their saturation on R and not to
their saturation on T. It was therefore argued that the T scale, as
scored by Eysenck, does not measure “tough-mindedness” among the
members of the Communist Party whom he sampled. This is the basic
point and Eysenck’s discussion of the statistical reliability of the T
scale in Coulter’s sample of communists (11, p. 433) evades the issue.
The issue is validity and not reliability. Is it too much to assume
that Eysenck knows that it is possible to have a reliable scale which
has no validity?

Analysis. In his rejoinder, Eysenck does not touch upon the critical
comments made about his use of “average of average” scores rather than
using conventional statistical techniques. He does not justify his
arbitrary lumping together of fascists and communists when this
violates the data. It is therefore not clear why he chooses as part of
the title of his rejoinder, “. . . the Personality Similarities
Between Fascists and Communists.” His data do not suggest similarities
but rather dissimilarities.
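
The specific objection to “average of average” scores is not restated here. One familiar difficulty with the practice, offered only as a sketch of the general problem, is that an unweighted average of group means ignores how many respondents stand behind each mean. The group sizes and means below are hypothetical, and the function names are illustrative.

```python
def mean_of_means(groups):
    """Unweighted average of the group means (an 'average of averages')."""
    return sum(mean for _, mean in groups) / len(groups)

def pooled_mean(groups):
    """Mean over all respondents, weighting each group mean by group size."""
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n

# Hypothetical (group size, group mean) pairs; illustrative only.
groups = [(250, 7.0), (27, 5.0), (43, 4.0)]

unweighted = mean_of_means(groups)  # treats a group of 27 like a group of 250
weighted = pooled_mean(groups)      # equals the mean of all 320 respondents
```

The two figures coincide only when the groups are of equal size or have equal means; otherwise small groups exert a disproportionate pull on the unweighted figure.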





THE F SCALE AND “AUTHORITARIANISM”


Eysenck’s statement that the F scale 
measured authoritarianism rather 
than potential fascism was examined. 
It was concluded that such a state- 
ment was completely unjustifiable 
in terms of other research and that 
Eysenck’s own data indicated the 
opposite. The only possible basis for 
such an assertion was the work of 
Coulter in which an alleged politically “neutral” group made an
unprecedentedly low score on the F
scale. Evidence was presented which 
clearly indicated that this particular 
sample was a highly aberrant one 
when their other test scores, as re- 
ported by Eysenck, were evaluated. 

Eysenck’s only reply to this por- 
tion of the critique is his refusal 
to acknowledge the relevance of data 
collected in Great Britain by Rokeach. 
He is correct in objecting to a refer- 
ence to material which had (at the 
time the critique was written) been 
submitted but not accepted for pub- 
lication. Since Rokeach’s monograph 
is now in press, the reader will be able 
to evaluate Rokeach’s finding that
the British college students sampled 
by Rokeach have lower F-scale scores 
than do his sample of British workers 
(17). 

The reasons for rejecting Eysenck’s argument that communists score
high on the F scale were indicated in the critique. Since other of
Rokeach’s data bear directly upon this point, they are worthy of
reproduction. Rokeach gave the F scale (1, pp. 255-257) to students at
London University. The respondents were asked to indicate their
political preferences. Table 2 indicates the results.

TABLE 2
F-SCALE SCORES OF STUDENTS IDENTIFYING WITH VARIOUS POLITICAL PARTIES*

Party Identification      N     Item-mean Score
Conservative              54    [?].98
Liberal                   22    3.39
Labor (Attleeites)        27    3.51
Labor (Bevanites)         19    [illegible]
Communist                 13    [?].86

* Item means computed from (17, Table 13, p. 34).

Completely contrary to Eysenck’s statement about communist scores on
the F scale, the communistically inclined students at the University
of London, where Eysenck teaches, scored lowest on the F scale. This
finding is in complete accord with some twenty years of research on
similar types of scales as was indicated in the critique.

It is once more concluded that Eysenck’s attempted equation of F-scale
scores and authoritarianism based upon Coulter’s samples is completely
out of line with all available data including his own.


FURTHER REMARKS 


The previous discussion has dealt with the specific points made in the
critique and Eysenck’s failure to answer them adequately. In evading
these criticisms, Eysenck raises many points which require
clarification. Two will be singled out for comment: his discussion of
“pure” T-scale items and his use of research other than his own to
support his speculations.

“Pure” T-scale items. An important aspect of Eysenck’s theorizing is
that, “The T-factor itself . . . is rather the projection on to the
social attitude field of a set of personality variables” (9, p. 170).
As noted in the critique, this conclusion of Eysenck’s might well have
resulted from the fact that his procedure of collecting items from
existing scales could not have possibly uncovered “pure” measures of T
since such were not included in the scales subjected to pruning. It
was suggested that, “If, instead of taking items which had been of
relevance in previous research, he had analyzed the definition of
tough-mindedness and then selected, invented, or modified items which
appeared relevant and then factor analyzed responses to them and other
items he might well have isolated a much purer dimension of
‘tough-tender-mindedness’” (4, pp. 427-428).

Eysenck apparently agrees with this criticism of his original
procedure since he quotes the statement and then says, “Having
attempted for many years to do what Christie advocates, and having had
several students make similar attempts, all without success, the
writer believes that Christie is somewhat optimistic” (11, p. 432).

An examination of Eysenck’s published work does not indicate a single
instance of his ever having offered an analysis of “tough-mindedness.”
In his earliest work in this area the items were described as forming
a “practical-theoretical” dichotomy and Eysenck later noted that the
interpretation of the factor was entirely subjective (9, p. 119). It
was in the report of work done upon the basic middle-class sample (5)
that the terms “tough-” and “tender-minded” were first applied to the
factor which appeared to remain after the radical-conservative axis
was extracted. No formal definition was given but comparisons between
the implicit content of the items and some of William James’s comments
about the tender-minded and the tough-minded have been made by Eysenck
(9, p. 131). Such an identification is extremely tenuous. If we were
to take Eysenck’s usage of James seriously we would have to conclude
that communists and fascists were not dogmatic (since dogmatism is a
Jamesian tender-minded trait and according to Eysenck communists and
fascists are “tough-minded”)!

No evidence is presented in support of Eysenck’s contention, nor does he indicate which of his students have attempted to undertake an analysis of the term “tough-mindedness.” Melvin, whose scales of R and T have been cited by Eysenck as being “improved” versions (9, p. 132), is the only student known to have done any other work on the construction of the T scale.⁷ This is his description of his approach:


The most logical way to begin this search 
would be to make a theoretical analysis of the 
concept of tendermindedness and then make 
formal deductions to a series of hypotheses 
about its verbal manifestations. This proce- 
dure was considered, but it soon became clear 
that it was difficult to arrive at any conclu- 
sions about the essential psychological nature 
of T by pure thought alone, and that a strictly 
formal approach would have to be abandoned 
(15, p. 122, italics in original). 


Melvin’s basic procedure was 
Eysenckian. He examined attitude 
scales published since 1947 in search 
of items (Eysenck had gone over 
earlier scales). In addition, however, 
he gathered items from the expressed 
opinions of minority group members 
and publications, from a political en- 
cyclopedia, and other new items were 
originated upon the basis of discus- 
sions with Eysenck. His pool of 239 
items was an adequate sampling of 
existing scales and also contained 
original material in contradistinction 
to Eysenck’s original 40 items (4, p. 
427). 


7 Melvin's thesis (15) was mentioned in the 
critique as being unknown at the University 
of London Library. It has been filed since the 
critique was submitted for publication (per- 
sonal communication from the Univer. of 
London Library, Aug., 1955). 







It is therefore not surprising that 
such an essentially empirical ap- 
proach, although conducted with 
methodological sophistication, led to 
the discovery of relatively few items 
which even hinted at the existence 
of ‘“‘pure’’ T items. Seven items were 
found which had negligible loadings 
on R (.08 or less) and modest loadings 
on T (.20 or more) (15, Appendices 
A-D, pp. i-xxviii). 

In his concluding chapter, Melvin 
says, “The difficulty noted... of 
obtaining valid Tendermindedness 
scores for toughminded-radicals raises 
another urgent problem. This might 
well be approached along similar 
lines to those adopted by the authors 
of The Authoritarian Personality... 
in their development of the California 
F-scale” (15, p. 344). 

The inferences to be drawn from 
Melvin's work are: (a) his approach 
was essentially empirical and did not 
follow from any rigorous analysis of 
the definition of “tough-mindedness,” 
and (b) he apparently believes that 
valid T-scale items might be obtained 
by a more theoretically oriented ap- 
proach. 

Eysenck’s claim that he and his 
students have been attempting for 
years to do research based upon an 
analysis of the definition of T is not 
supported by any known published 
or unpublished material. Indeed, the 
most recent and most relevant thesis 
—done under Eysenck’s own super- 
vision—suggests that such an ap- 
proach be tried! 

It is a moot point whether or not “pure” T items could be uncovered. To repeat the point made in the critique, the procedures used by Eysenck and his students could not possibly have uncovered many of them, if they did exist, because no formal attempt was made to define what they were looking for and the selection of items was limited primarily to those used by other investigators.

Eysenck's use of supporting re- 
search. In his replies to Rokeach and 
Hanley and to this critic, Eysenck 
has not chosen to answer specific 
criticisms about his methodology. In- 
stead he has preferred to rely upon 
references to unpublished theses of 
his students. 

There are three studies done by 
Eysenck and his students which have 
contained comparisons between com- 
munist, fascist, and other political 
party samples’ scores upon the T 
scale. The first of these was the 1951 
study of Eysenck (7). Flaws in this 
study were pointed out and these 
were not directly answered in Ey- 
senck’s reply. 

The second was an unpublished thesis by Coulter which Eysenck referred to in his reply to Rokeach and Hanley as supporting his position. This critic raised questions about Coulter’s thesis which, if correct, invalidated it as a meaningful comparison of the “tough-mindedness” of communists, fascists, and major party members. Eysenck does not mention, let alone attempt to answer these criticisms in his reply. He also does not again cite this thesis in support of his position in his reply.

The third study was an unpublished thesis by Nigniewitzky. In his reply to this critic Eysenck places primary reliance upon it and anticipates that it might be considered irrelevant to criticisms based upon earlier studies (11, p. 435). Such an anticipation is correct. Even if Nigniewitzky’s data could withstand critical methodological scrutiny there would be no justification for the many errors made by Eysenck. If Nigniewitzky’s results are correctly reported by Eysenck, the latter is placed in the position of having blundered onto the truth despite a concatenation of critical mistakes.

The temporary unavailability of Nigniewitzky’s thesis⁸ makes it impossible to determine whether it is relevant to the following statement by Eysenck, “Thus, an improvement in sampling procedures, as demanded [sic] by Christie, and an improvement in the scale used do not result, as would be predicted from his criticisms, in a lessening of the observed differences between communists and the orthodox political parties . . .” (11, p. 436, his italics).

It is unclear from Eysenck’s statements what sort of a sample was utilized by Nigniewitzky. According to Eysenck, when replying to Rokeach and Hanley, it was “. . . a properly stratified sample of the French population . . .” (10, p. 178). According to Eysenck, when replying to this writer it was “. . . a properly selected representative sample of the French middle-class population” (11, p. 435, italics added).

8 The only possible way to get copies of theses from the University of London Library is to request that a microfilm copy be prepared. The person requesting such a copy must pay for it and sign a statement that it will not be quoted without written permission. This procedure was followed in [illegible] thesis and it required [illegible] inquiry until [illegible] microfilm copy was received. In view of the short period of time available [illegible] Eysenck it did not appear feasible [illegible] for a microfilm copy of Nigniewitzky’s thesis nor was it certain that quotations would be permitted. Following Eysenck’s remarks in Footnote 2 of his reply (11, p. 437), the writer requested copies of both Melvin’s and Nigniewitzky’s theses. A copy of Melvin’s thesis was graciously forwarded. Eysenck said that a copy of Nigniewitzky’s thesis had not yet been received from the University of London Library. At the time (March 7, 1956) Nigniewitzky was sent a letter requesting a copy of his thesis. No reply has been received as of the time this article was submitted for publication.

It is also unclear from what Eysenck says what improvements in the T scale were made and what relevance these might have to the criticisms made of the earlier version. The only “improved” versions of the T scale mentioned by Eysenck in other contexts are those by Melvin. The problem of unequal distribution of T-scale items in the four quadrants also exists in Melvin’s two scales since they both had ten items in each of the two “tough-minded” quadrants and six in each of the two “tender-minded” quadrants (15, Appendices A-D, pp. i-xxviii). This distribution, combined with either of the two scoring systems discussed by Melvin, would not alleviate the sources of bias discussed in comparing communists with members of major political parties on the T scale. Unlike Eysenck, Melvin recognizes this problem and discusses in his thesis the then unsolved problem of communists responding to T-scale items in terms of their loading on R rather than T (15, pp. 219-225).

Aside from a lack of satisfactory detail, Eysenck’s remarks about the greater differences found between communists and major party members in France than in England evade the issue and are completely irrelevant. Gross methodological errors in Eysenck’s and Coulter’s studies of British party members made their comparisons meaningless. It therefore follows that no prediction whatsoever can be made as to whether or not a new study (using proper sampling and measurement procedures) would show an increase or lessening in relative differences of parties along the T scale when compared with their results.

It would be unjust to prejudge Nigniewitzky’s thesis upon the basis of Eysenck’s ambiguous remarks. It would be unwise to accept the latter’s statements about it at face value, however, since Eysenck “. . . cannot agree with any of the major criticisms put forward . . .” (11, p. 431) of his own work and nowhere indicates an awareness of the implications of his many methodological excesses.

Eysenck concludes his flight from 
criticism by inviting the reader to ex- 
amine five studies using the T scale 
carried out in countries other than 
Great Britain. The suggestion is 
irrelevant as far as criticism of 
Eysenck’s procedures is concerned. 
The prospective reader should be re- 
minded that with the presumed ex- 
ception of Nigniewitzky’s study, all 
of them utilized Eysenck’s original T 
scale. Their interpretation is there- 
fore subject to all the cautions neces- 
sitated in evaluating results obtained 
with this unique “measurement” in- 
strument. 

It is contended that Eysenck’s extensive citation of research other than his own fails to answer the methodological criticisms raised about his own work. In those instances where such research has been available for examination, it does not support Eysenck but confirms the criticisms made or is irrelevant to them.


CONCLUSION 


Eysenck contends that communists and fascists are more “tough-minded” and “authoritarian” than are members of major political parties. This plausible assumption turns out, upon critical inspection, to be based upon errors of computation, uniquely biased samples which forbid any generalizations, scales with built-in biases which do not measure what they purport to measure, unexplained inconsistencies within the data, misinterpretations and contradictions of the relevant research of others, and unjustifiable manipulations of the data. Any one of Eysenck’s many errors is sufficient to raise serious questions about the validity of his conclusions. In toto, absurdity is compounded upon absurdity, so that where, if anywhere, the truth lies is impossible to determine.

9 For an amusing earlier demonstration of the same point see the comments by Greenall (12) and Eysenck’s reply (6).

It had been hoped that Eysenck’s reply to specific criticisms would be directed toward acknowledging their relevance or rebutting them. If this had been done our exchange would have served to clarify problems and sharpen legitimate points of difference. Instead, Eysenck does not rebut a single specific criticism.

Eysenck’s responses to those critical points which he takes note of invariably evade the specific issue. Reliance is placed upon an extensive citation of the research of others. Those that are available do not support his position but indicate the cogency of the criticism.

This critic rests his case. It is believed that the detailed and perhaps tedious documentation of Eysenck’s scientific sins of omission and commission is sufficient to raise grave doubts about the validity of his conclusions. The reader is invited to decide for himself whether or not Eysenck’s many methodological errors and his evasions of specific criticisms constitute abuses of psychology.










REFERENCES 


1. Adorno, T. W., Frenkel-Brunswik, Else, Levinson, D. J., & Sanford, R. N. The authoritarian personality. New York: Harper, 1950.

2. Almond, G. A. The appeals of communism. Princeton: Princeton Univer. Press, 1954.

3. Christie, R. Review of Eysenck’s The Psychology of Politics. Amer. J. Psychol., 1955, 68, 702-704.

4. Christie, R. Eysenck’s treatment of the personality of communists. Psychol. Bull., 1956, 53, 411-430.

5. Eysenck, H. J. Primary social attitudes: 1. The organization and measurement of social attitudes. Int. J. Opin. Attitude Res., 1947, 1, 49-84.

6. Eysenck, H. J. Reply. Brit. J. Psychol., Statist. Sec., 1949, 2, 65.

7. Eysenck, H. J. Primary social attitudes as related to social class and political party. Brit. J. Sociol., 1951, 2, 198-209.

8. Eysenck, H. J. Uses and abuses of psychology. London: Penguin, 1953.

9. Eysenck, H. J. The psychology of politics. London: Routledge & Kegan Paul, 1954.

10. Eysenck, H. J. The psychology of politics: a reply. Psychol. Bull., 1956, 53, 176-182.

11. Eysenck, H. J. The psychology of politics and the personality similarities between fascists and communists. Psychol. Bull., 1956, 53, 431-438.

12. Greenall, P. D. Two criticisms. Brit. J. Psychol., Statist. Sec., 1949, 2, 64-65.

13. Hanley, C., & Rokeach, M. Care and carelessness in psychology. Psychol. Bull., 1956, 53, 183-186.

14. Lindquist, E. F. Design and analysis of experiments in psychology and education. Cambridge: Houghton Mifflin, 1953.

15. Melvin, D. An experimental and statistical study of two primary social attitudes. Unpublished doctor’s dissertation, Univer. of London, 1955.

16. Rokeach, M., & Hanley, C. Eysenck’s tender-mindedness dimension: a critique. Psychol. Bull., 1956, 53, 169-176.

17. Rokeach, M. Political and religious dogmatism: an alternative to the authoritarian personality. Psychol. Monogr., 1956, 70, No. 18 (Whole No. 425).

18. Stouffer, S. A. Communism, conformity, and civil liberties. Garden City: Doubleday, 1955.


Received July 8, 1956. 





PSYCHOLOGICAL BULLETIN 
Vol. 53, No. 6, 1956 





THE QUANTITATIVE STUDY OF SHAPE AND 
PATTERN PERCEPTION¹ 
FRED ATTNEAVE AND MALCOLM D. ARNOULT 
Skill Components Research Laboratory, Air Force Personnel and Training Research Center 


The pre-eminent importance of 
formal or relational factors in per- 
ception has been abundantly demon- 
strated during some forty years of 
gestalt psychology. It seems extra- 
ordinary, therefore, that so little 
progress has been made (and, indeed, 
that so little effort has been ex- 
pended) toward the systematizing 
and quantifying of such factors. Our 
most precise knowledge of perception 
is in those areas which have yielded 
to psychophysical analysis (e.g., the 


perception of size, color, and pitch), 
but there is virtually no psycho- 
physics of shape or pattern. 

Several difficulties may be pointed 
out at once: (a) Shape is a multidi- 


mensional variable, though it is often 
carelessly referred to as a “dimen- 
sion,” along with brightness, hue, 
area, and the like. (b) The number 
of dimensions necessary to describe 
a shape is not fixed or constant, but 
increases with the complexity of the 
shape. (c) Even if we know how 
many dimensions are necessary in a 
given case, the choice of particular 
descriptive terms (i.e., of reference- 
axes in the multidimensional space 
with which we are dealing) remains 
a problem; presumably some such 
terms have more psychological mean- 
ingfulness than others. 


1 This research was carried out at the Skill 
Components Research Laboratory, Air Force 
Personnel and Training Research Center, 
Lackland Air Force Base, San Antonio, 
Texas, in support of Project 7706, Task 27001. 
Permission is granted for reproduction, trans- 
lation, publication, use, and disposal in whole 
or in part by or for the United States Govern- 
ment. 




The need for an adequate psychophysical framework is most obvious in those studies (having to do with discrimination, for example, or with positive or negative transfer) in which it is necessary to manipulate shape or pattern as an independent variable. Unless some meaningful units of variation are specifiable, functional relationships cannot be obtained. It is somewhat less obvious, but nonetheless true, that a comparable need exists in experiments which seek to determine how form perception is influenced by extrinsic variables such as size, contrast, method and degree of familiarization, etc. In studies of this sort, the experimenter commonly uses some small, arbitrarily chosen set of stimuli: sometimes simple geometrical forms; sometimes a group of “nonsense” shapes which he draws in a more or less haphazard manner. If the results obtained are “significant” in the usual sense, we have some specifiable degree of confidence that they are generalizable to people other than those used as subjects, but the degree to which they are generalizable to new stimuli remains a matter of conjecture. Yet the latter kind of generalization is no less important than the former. Only in rare cases of applied research is the investigator really content with results which hold only for the particular stimulus objects employed experimentally.

Egon Brunswik (9, 10, 11) is perhaps the only psychologist who has ever given due weight to the importance of stimulus-sampling, or of situation-sampling in general. Although the approach of this paper is somewhat different from Brunswik’s, for reasons which are developed below, we wish to acknowledge freely Brunswik’s influence upon our own thinking, and to commend his writings on this subject to any reader unacquainted with them. Brunswik takes the reasonable position that results with “ecological validity” may be obtained only by the use of experimental materials which are drawn from, and hence representative of, the real situations to which one wishes to generalize. Thus, in the study of shape perception, it would be desirable to experiment with the shapes of natural objects. Suppose, however, that we wish to investigate the learning and memory of shapes with which subjects are initially unfamiliar: the requirement of unfamiliarity will obviously preclude the experimental use of shapes which are commonly encountered. Is there any sensible procedure for choosing stimulus-materials in this sort of situation?

It is our belief, at this time, that the problem of generalizing from experimental stimuli may profitably be broken into two parts. First, there is the problem of specifying the stimulus-domain, i.e., the problem of drawing a sample of stimuli from a parent population characterized by certain determinate statistical parameters. The stimulus-domain, or parent population, includes all those stimuli to which the results may be generalized, and is defined by the statistical parameters which characterize it. In the following section we shall indicate a variety of particular methods for drawing “random” patterns and shapes from such clearly defined hypothetical populations, to which experimental results may then be generalized with measurable confidence.

The second problem, which is 
really a special case of the first, is 
that of drawing a sample which has 
“ecological validity.” If our real aim 
is to generalize to natural forms, or 
to some subset thereof, it is necessary 
to estimate the psychologically 
important statistical parameters of 
these natural forms in order that ex- 
perimental materials may be con- 
structed to possess the same parame- 
ters. Thus, we are brought back to 
the acute need for a general psycho- 
physics of form. In the final section 
we shall discuss the kinds of physical 
analysis and measurement which ap- 
pear appropriate to such a psycho- 
physics. 


THE CONSTRUCTION OF STIMULI 


All the methods described below 
for constructing nonsense shapes and 
patterns have in common the fact 
that the particular characteristics of 
each figure are randomly determined. 
Each method is, in effect, a set of 
rules by which points are plotted and 
connected in accordance with values 
obtained from a table of random 
numbers. Each method, or set of 
rules, thus determines a domain of 
stimuli. The stimuli actually con- 
structed for use in a given experiment 
will, if they are all constructed ac- 
cording to the same rules, be a ran- 
dom sample of the stimulus-domain 
defined by the set of rules. The ex- 
perimental results, consequently, may 
be generalized both to the entire 
stimulus-domain and to the appropriate 
subject population.² 
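
The idea that a set of construction rules defines a stimulus-domain, and that the stimuli built for one experiment are a random sample from it, can be sketched in code. The following is only an illustration and not a procedure taken from the text: the class name `PointDomain`, its fields, and the use of a seeded pseudo-random generator in place of a printed table of random numbers are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PointDomain:
    """Hypothetical stimulus-domain: n_points plotted on a grid x grid matrix."""
    grid: int = 100      # assumed matrix size
    n_points: int = 8    # assumed number of plotted points per stimulus

    def draw(self, rng):
        """One stimulus: a list of (x, y) pairs, each coordinate in 1..grid."""
        return [(rng.randint(1, self.grid), rng.randint(1, self.grid))
                for _ in range(self.n_points)]

    def sample(self, k, seed):
        """k stimuli drawn by the same rules; the seed stands in for the
        starting place in a table of random numbers."""
        rng = random.Random(seed)
        return [self.draw(rng) for _ in range(k)]


domain = PointDomain(grid=100, n_points=8)
stimuli = domain.sample(k=5, seed=7)
```

Because every stimulus comes from the same two parameters, the five stimuli are a random sample of one domain, and the same seed reproduces the same sample.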

2 The kind of double-generalization proposed here would require an error term which included the variance due to subjects, the variance due to stimuli, and the interaction between them. In what is perhaps the most obvious analysis-of-variance design, the subjects × stimuli × treatments mean square would be the appropriate error term to use.






The experimenter who desires to 
use stimuli constructed in this man- 
ner must determine what set of rules 
will provide him with a stimulus 
population having the character- 
istics he wants. If one desires to gen- 
eralize experimental results to the 
world of real objects (chairs, airplanes, 
people, etc.), it is necessary 
to have a stimulus sample possessing 
ecological validity. To construct 
nonsense stimuli of this sort one 
must know the pertinent parameters 
of the stimulus-domain of real ob- 
jects and use these parameters in 
constructing the experimental stim- 
uli. In the next section we shall dis- 
cuss some of the problems inherent 
in this methodological requirement 
and some of the attempts which have 
been made to solve them. 

In the present section, some gen- 
eral methods for constructing stimuli 
are described in sufficient detail that 
the reader, if he desires, may repeat 
the operations in order to develop 
additional stimuli belonging to the 
various stimulus-domains defined by 
the methods. It should be kept in 
mind, however, that these methods 
are described merely as examples 
and are not intended to constitute a 
comprehensive catalog of all possible 
methods. Descriptions will be given 
of methods for generating shapes 
having either closed or open contours, for generating various kinds of patterns, and for introducing systematic variations or transformations of shapes or patterns.


Closed Contours—Angular Shapes 


Method 1. Starting with a sheet of graph paper (say, 100 × 100), successive pairs of numbers between 1 and 100 are selected from a table of random numbers. Each pair will determine a point which can be plotted on the 100 × 100 matrix. The total number of such points to be plotted can be determined either randomly or arbitrarily.

When all the points have been plotted, a straightedge is used to connect the most peripheral points in such a way as to form a polygon having only convex angles. This operation will usually leave some unconnected points within the polygon (Fig. 1a). When a point falls within some small, arbitrarily chosen distance of the proper perimeter (e.g., the point between segments 7 and 8 in Fig. 1a) it is included even though it makes a slightly concave angle, since otherwise an indentation practically dividing the shape into two parts might later occur. The sides of the polygon are numbered, and the points remaining inside are assigned letters. The table of random numbers is then used to determine which of the central points is connected to which side. In the example given, Point C was connected to Side 2, forming in the process Side 10 (Fig. 1b). At this stage in the construction, the possibilities of connecting points have been changed. Point A may now be taken into Sides 3, 4, 5, 6, 7, 8, or 10, but not into Sides 1, 2, or 9. Point B may be connected only to Side 2 or Side 10. If Point A is connected to Side 5, forming new Side 11, there remains only the possibility of connecting Point B to Side 2 or Side 10 (see Fig. 1b). Connecting Point B to Side 10 completes the shape, which finally appears as shown in Fig. 1c.

FIG. 1. SUCCESSIVE STAGES IN THE CONSTRUCTION OF A “RANDOM” FIGURE ACCORDING TO METHOD 1 (SEE TEXT)

It will be noted that every step in the procedure is determined either randomly or by the elimination of all other possibilities. Furthermore, every step is completely determinate and can be duplicated by anyone using the same rules and the same selections from the table of random numbers.
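
The steps of Method 1 can be sketched in code as follows. This is a simplified illustration under stated assumptions, not the authors' exact procedure: a seeded pseudo-random generator replaces the table of random numbers; the rule admitting points that fall within a small distance of the perimeter is omitted; a connection is treated as legal only if the two new sides cross no existing side and the triangle cut off contains no other point (an interpretation of the restrictions described above); and the next connection is chosen uniformly among the legal ones rather than by drawing and rejecting. All function names are hypothetical.

```python
import random


def cross(o, a, b):
    """Twice the signed area of triangle o-a-b (positive if counter-clockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Most peripheral points, counter-clockwise (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts

    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half(pts) + half(reversed(pts))


def properly_cross(p, q, r, s):
    """True if segments pq and rs cross at a point interior to both."""
    return (cross(r, s, p) * cross(r, s, q) < 0 and
            cross(p, q, r) * cross(p, q, s) < 0)


def in_triangle(p, a, b, c):
    """True if p lies inside or on the boundary of triangle abc."""
    d = (cross(a, b, p), cross(b, c, p), cross(c, a, p))
    return not (min(d) < 0 and max(d) > 0)


def can_connect(poly, i, p, others):
    """May interior point p be taken into side i (assumed legality test)?"""
    a, b = poly[i], poly[(i + 1) % len(poly)]
    if cross(a, b, p) == 0:          # p collinear with the side: degenerate
        return False
    edges = [(poly[j], poly[(j + 1) % len(poly)]) for j in range(len(poly))]
    if any(properly_cross(a, p, r, s) or properly_cross(p, b, r, s)
           for r, s in edges):
        return False
    blockers = [v for v in poly if v != a and v != b] + list(others)
    return not any(in_triangle(v, a, b, p) for v in blockers)


def method1_shape(points, rng):
    """Return (polygon vertices in order, interior points left unconnected)."""
    poly = convex_hull(points)
    n = len(poly)
    # Only points strictly inside the hull are candidates for connection.
    inside = [p for p in sorted(set(points))
              if all(cross(poly[i], poly[(i + 1) % n], p) > 0 for i in range(n))]
    while inside:
        options = [(i, p) for p in inside for i in range(len(poly))
                   if can_connect(poly, i, p, [q for q in inside if q != p])]
        if not options:              # no legal connection remains
            break
        i, p = rng.choice(options)
        poly.insert(i + 1, p)        # side i is replaced by two new sides
        inside.remove(p)
    return poly, inside


rng = random.Random(1956)
points = [(rng.randint(1, 100), rng.randint(1, 100)) for _ in range(9)]
shape, unused = method1_shape(points, rng)
```

Re-running with the same seed reproduces the same shape, which mirrors the remark that the construction can be duplicated from the same rules and the same random selections.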
Method 2. This method of constructing random shapes is also started by plotting successive pairs of random numbers as coordinates on graph paper. As each point is plotted it is given a number so that eventually all are numbered serially. These points are then connected in the order in which their serial numbers first appear in a table of random numbers, except that numbers which violate certain rules of construction are rejected. The incomplete construction shown in Fig. 2a will provide examples of permitted and nonpermitted connections. The rules for connecting points are as follows:

a. No line may be drawn twice. 
Assume, in Fig. 2a, that the last line 
drawn was from Point 2 to Point 5. 
If the next number in the table were 
2, it would be rejected since that con- 
nection has already been made. 

b. No line may be drawn which 
completely encloses a point within 
the perimeter of the figure. From 
Point 5 it would not be permissible to 
draw a line to Point 6 or to Point 4, 
since either action would completely 
enclose Points 3 and 8. 

c. No two points may be directly connected if they are already connected by a path which follows perimeter lines without passing through any other plotted points. For example, Point 5 may not be connected to Points 3 or 7, Point 3 may not be connected to Points 5 or 6, and Point 2 may not be connected to Point 4.

d. The figure is complete when each point has been connected to at least two other points. It sometimes happens that the table of random numbers leads one to a point which already has all the other connections allowed it. In this case one of the other points is chosen randomly as a new origin and the regular process is continued. The incomplete shape of Fig. 2a is shown in a completed form in Fig. 2b.

FIG. 2. EXAMPLE OF NONSENSE SHAPE CONSTRUCTED BY THE RULES OF METHOD 2: a. INCOMPLETE CONSTRUCTION DEMONSTRATING PERMISSIBLE AND NONPERMISSIBLE CONNECTIONS; b. THE COMPLETED SHAPE

As is the case with all the methods 
described in this paper, this method 
is completely objective. The result- 
ing figure could be reproduced, if 
necessary, from a set of coded instruc- 
tions consisting only of the numbers 
originally selected from the table. 

Unlike Method 1, Method 2 usu- 
ally generates shapes containing some 
angles in addition to all those at 
originally plotted points. This difference 
is emphasized by Rule c of 
Method 2. 

In Method 1 there are no restric- 
tions on the ways in which the plotted 
points may be connected except that 
(a) the figure must be closed, and (b) 
connecting lines may not cross, i.e., 
the completed figure may have angles 
only at the original points. 

In Method 2, on the other hand, 
there may be “emergent” angles at 
places other than originally plotted 
points, and the figures produced tend 
to be characterized by “good continuation.” 
Again, it is Rule c of Method 
2 which causes many of the perimeter 
lines of the final figure to be continua- 
tions of other perimeter lines. 

Comparing the two methods in terms of the informational content of the shapes produced shows that in Method 1 information (in addition to that required to locate the original points) is used only in connecting the interior points to the sides of the original perimeter, whereas in Method 2 information is used in making all connections between plotted points. For this reason a Method 2 shape composed of n original points and containing n+k angles (k representing the number of “emergent” points) will contain more information than a Method 1 shape composed of n original (and final) points. Because of the good continuation introduced into the figure, however, the Method 2 shape having n+k points will contain less information than would a Method 1 shape having n+k original points.

Method 3. Fitts, Weinstein, Rappaport,
Anderson, and Leonard (15)
have developed a technique for
constructing “metric” figures, the
informational content of which may be
easily and accurately determined.
Starting with a somewhat smaller
matrix, say 8×8, the number of
cells to be filled (from the bottom up)
in each column of the matrix is
randomly determined. This method
produces shapes which belong to a
relatively small stimulus-domain and
which are equal in informational
content. A variation of this method
involves allowing each possible
column-height to appear only once in
each shape, with the order of appearance
determined randomly. This second
stimulus-domain contains members
which are equal in area and,
consequently, contain less information
than the shapes first described. Still
another variation may be introduced
by reflecting each shape on one of its
axes to produce a symmetrical shape
containing no more information than
its nonsymmetrical predecessor.
Examples of these various classes of
metric figures may be found in
Reference 15.
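The informational content of the two classes of metric figures can be illustrated with a short calculation. The following Python sketch is not taken from Reference 15: it assumes an 8×8 matrix in which every column height from 1 to 8 is equally likely, and all function names are hypothetical.

```python
import math
import random


def metric_figure(n=8, rng=random):
    """Column heights of an n x n metric figure.

    Assumption: each height is drawn independently and uniformly from 1..n.
    """
    return [rng.randint(1, n) for _ in range(n)]


def constrained_metric_figure(n=8, rng=random):
    """Variation: every possible height appears exactly once, in random order."""
    heights = list(range(1, n + 1))
    rng.shuffle(heights)
    return heights


def bits_unconstrained(n=8):
    """n independent choices among n equally likely heights."""
    return n * math.log2(n)


def bits_constrained(n=8):
    """One of n! equally likely orderings of the heights."""
    return math.log2(math.factorial(n))


def render(heights):
    """Text picture of the figure, each column filled from the bottom up."""
    return "\n".join(
        "".join("#" if h >= row else "." for h in heights)
        for row in range(max(heights), 0, -1)
    )


if __name__ == "__main__":
    print(render(constrained_metric_figure()))
    print(bits_unconstrained(), bits_constrained())
```

Under these assumptions an unconstrained 8×8 figure carries 8 × log2 8 = 24 bits, whereas the constrained variation carries log2 8!, or about 15.3 bits, and always fills 36 cells.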








Closed Contours—Curved Shapes


Method 4. This method describes
a procedure for making wholly or
partially curved shapes from the
angular shapes constructed by
Method 1 or 2. This procedure may
appear to be somewhat involved, but
actually it requires more time to
describe than to perform. Essentially,
it consists merely of replacing angles
with inscribed arcs, of curvature
chosen randomly within limits
imposed by the figure.

FIG. 3. METHOD FOR INTRODUCING “RANDOM” CURVES INTO AN ANGULAR NONSENSE SHAPE. THE ORIGINAL SHAPE IS THE SAME ONE WHICH APPEARED IN FIG. 1c

For purposes of demonstrating the
method, let us start with the shape
described and constructed under
Method 1 (Figs. 1a-1c). It is decided
(arbitrarily or randomly) that
four of the twelve angles are to be
curved. Let us suppose that Angles
C, F, J, and K (Fig. 3) are chosen.
(For convenience of exposition the
angles have been assigned the letters
A through L.) The first step in the
process consists in constructing line
Cp, which is the bisector of ∠BCD.
Then, the shorter of the two arms of
the angle (in this case, line BC) is
divided into equal units. These units
may be chosen for convenience. For
example, Fig. 3 was constructed on a
100 × 100 matrix having matrix units
equal to 0.20 in., and Line BC was
arbitrarily divided into segments of
0.25 in. each. It should be noted that
the divisions of the line are numbered
in sequence, starting always
from the apex of the angle.

One of these numbered points on
line BC is now chosen at random and
a perpendicular from Line Cp to it
is constructed (Line 5-q). This line
(5-q) now becomes the radius of an
arc which is inscribed within ∠BCD.
The arc is tangent to Line BC and
Line CD at points equidistant from
C. Thus, ∠BCD has now been replaced
by a curve (actually, two linear
segments and an arc) going from
B to D.
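The construction just described can be restated in coordinates. The following Python sketch is an illustration under stated assumptions rather than the authors' drafting procedure: points are (x, y) pairs, `t` is the distance from the apex to the randomly chosen division, and the function names are hypothetical. The radius follows from the geometry of the bisector: a perpendicular erected on an arm at distance t from the apex meets the bisector after a length of t·tan(half-angle).

```python
import math
import random


def inscribed_arc(b, c, d, t):
    """Arc replacing angle BCD, tangent to both arms at distance t from C.

    Returns (center, radius, tangent_on_cb, tangent_on_cd).
    Assumes the angle at C is neither 0 nor 180 degrees.
    """
    def unit(p, q):
        dx, dy = q[0] - p[0], q[1] - p[1]
        length = math.hypot(dx, dy)
        return dx / length, dy / length

    u = unit(c, b)                      # direction of arm CB
    v = unit(c, d)                      # direction of arm CD
    cos_angle = max(-1.0, min(1.0, u[0] * v[0] + u[1] * v[1]))
    half = math.acos(cos_angle) / 2.0   # half of angle BCD
    radius = t * math.tan(half)         # length of the perpendicular (e.g., 5-q)
    bx, by = u[0] + v[0], u[1] + v[1]   # bisector direction (line Cp)
    norm = math.hypot(bx, by)
    reach = t / math.cos(half)          # distance from C to the arc's center
    center = (c[0] + bx / norm * reach, c[1] + by / norm * reach)
    tangent_cb = (c[0] + u[0] * t, c[1] + u[1] * t)
    tangent_cd = (c[0] + v[0] * t, c[1] + v[1] * t)
    return center, radius, tangent_cb, tangent_cd


def choose_tangent_distance(b, c, d, unit_length, rng=random):
    """Pick one numbered division of the shorter arm at random.

    Divisions are numbered from the apex beginning with zero, so a
    choice of zero leaves the angle as originally drawn.
    """
    shorter = min(math.dist(c, b), math.dist(c, d))
    divisions = int(shorter // unit_length)
    return rng.randint(0, divisions) * unit_length
```

For a right angle with its apex at the origin and arms along the axes, a tangent distance of 1 gives an arc of radius 1 centered at (1, 1), touching the arms at (1, 0) and (0, 1).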

Point F has been curved by the 
same process. Angle EFG is bisected 
by line Fr, and Line FG is divided 
into equal segments. Division 8 hav- 
ing been chosen at random, line 8-s 
is constructed and used as a radius 
for inscribing a curve within ∠EFG.

The next two constructions dem- 
onstrate the complex curvature which 
may result when successive points 
are chosen to be curved. Point J 
is curved by the process described 
above, with Line 13-u being used as
the radius of an arc inscribed within
∠IJK. However, in curving Point
K it is necessary to inscribe an arc
within ∠J′KL, not within ∠JKL.
Point J′ is the point at which the arc
constructed with radius 13-u becomes
tangent to line JK.

If it is so desired, all the points of 
an angular figure may be curved. It 
should be noted, however, that the 
shorter arm of every angle is divided 
into segments, and that its divisions 
are numbered beginning with zero. If 
the zero is the random choice, the
resulting curve will have zero radius,
i.e., that angle remains as originally
drawn.

Method 5. Angular shapes can be 
changed into curved shapes by a pro- 
cess of photographic blurring. The 
figure is first photographed and then, 
with the help of an enlarger, is 
printed out-of-focus on high contrast 
paper. The resulting image has a con- 
tour which is curved, but which is 
also graded in density. A repetition 
of the process of photographing and 
printing, however, will eliminate the 
density gradient, producing a shape 
with contours which are rounded and 
well-defined. The amount of blur 
may, of course, be carefully con- 
trolled, and a graded series of curved 
shapes may be made from a single 
prototype shape. 


Open Contours 


Method 6. There are many ways in 
which open-contour nonsense shapes 
may be constructed from a table of 
random numbers, but all that we 
have used have been variations on 
one basic method. Starting from the
approximate center of a matrix of
convenient size, a line is drawn to one 
of the eight intersections nearest the 
starting point. These eight intersec- 
tions (or, more generally, directions) 
have been assigned numbers as shown 
in Fig. 4a. The intersection on the 
graph paper at which the first line 
terminates becomes the origin for 
the second line to be drawn, and so 
on. A difficulty with this method is 
that there is no intrinsic criterion for 
completeness in such a figure. One 
objective rule is to determine, before 
beginning the construction, the total
number of digits to be selected from 
the table and to consider the figure 
complete when that number of lines 
has been drawn. 
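The basic open-contour procedure amounts to a walk on the lattice directed by successive digits. The Python sketch below illustrates it under assumptions that are not in the text: the assignment of the digits 1 through 8 to directions is hypothetical (the actual numbering is that of Fig. 4a), and the digits 0 and 9 are simply skipped.

```python
import random

# Hypothetical numbering: 1 = up, then clockwise in 45-degree steps.
DIRECTIONS = {
    1: (0, 1), 2: (1, 1), 3: (1, 0), 4: (1, -1),
    5: (0, -1), 6: (-1, -1), 7: (-1, 0), 8: (-1, 1),
}


def open_contour(digits, n_lines, start=(0, 0)):
    """Draw lines between neighboring intersections until n_lines exist.

    `digits` is any iterable of random digits; each usable digit sends
    the contour to one of the eight nearest intersections.
    """
    points = [start]
    for digit in digits:
        if len(points) - 1 == n_lines:   # completeness rule fixed in advance
            break
        step = DIRECTIONS.get(digit)
        if step is None:                 # 0 and 9 carry no direction here
            continue
        x, y = points[-1]
        points.append((x + step[0], y + step[1]))
    return points


def random_digit_table(rng=random):
    """Endless stream of digits standing in for a printed table."""
    while True:
        yield rng.randint(0, 9)


if __name__ == "__main__":
    print(open_contour(random_digit_table(), n_lines=20))
```

Fixing `n_lines` before construction begins corresponds to the objective completeness rule suggested above.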

Many variations on this basic technique
may be introduced. For example,
for some purposes it may be
desired to allow only four directions
in which the contour may vary; also,
the length of each line may be
determined randomly as well as the
direction. Partially or wholly curved
contours may be produced by this
method as follows: the radius of
curvature of the arc drawn to connect
successive intersections along the
horizontal and vertical axes of the
matrix is set as one-half the length
of a matrix unit. To connect two
intersections diagonally separated,
the arc would have a radius equal to
one matrix unit. Thus, for example,
one might determine randomly for
each line constructed: (a) which two
intersections will be connected, (b)
whether the connection is to be linear
or curved, and (c) the direction of
curvature. Figure 4b was drawn by
this technique. Additional variations
on these methods may be provided
by using semi-log, log-log, or polar
coordinate matrices on which to
construct the nonsense contours.

FIG. 4. CONSTRUCTION OF OPEN-CONTOUR “RANDOM” SHAPE: a. NUMBERING OF POSSIBLE INTERSECTIONS; b. TYPICAL NONSENSE SHAPE


Patterns 
Method 7. Although the more obvious
ways of generating random patterns
have been used by a number of
investigators, the possibilities of this
approach to the construction of complex
visual displays have never been
adequately explored. In general the
practice has been to construct a
matrix of some given size and then to
determine randomly which cells are
to be filled. Patterns of dots were
constructed in this fashion by Kaufmann,
et al. (19), French (16), and
Klemmer and Frick (20), for example.
Attneave used the same approach,
including the introduction of
a symmetry factor, in a study of the
effect of redundancy on memory for
patterns (4). In another slight variation
Arnoult used random shapes as
elements in constructing random patterns
for use in a learning experiment
(2). Patterns generated in this fashion
are very attractive as stimuli because
it is usually possible to compute
fairly precisely the informational
content of the display.
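How such a computation might run can be shown with a small sketch. The Python fragment below is illustrative only: it assumes either that each cell is filled independently with probability one-half or that a fixed number of cells is filled, and the function names are hypothetical rather than drawn from the studies cited.

```python
import math
import random


def random_pattern(rows, cols, rng=random):
    """Matrix of 0/1 cells, each filled independently with probability 1/2."""
    return [[rng.randint(0, 1) for _ in range(cols)] for _ in range(rows)]


def symmetric_pattern(rows, cols, rng=random):
    """Left-right symmetric pattern (cols assumed even).

    Only half of the cells are chosen freely; the rest are reflections.
    """
    half = random_pattern(rows, cols // 2, rng)
    return [row + row[::-1] for row in half]


def bits_independent(n_free_cells):
    """One bit per freely chosen cell when filling is a fair coin toss."""
    return float(n_free_cells)


def bits_fixed_count(n_cells, n_filled):
    """Information when exactly n_filled of n_cells are filled at random."""
    return math.log2(math.comb(n_cells, n_filled))


if __name__ == "__main__":
    for row in symmetric_pattern(4, 6):
        print("".join("#" if cell else "." for cell in row))
    print(bits_independent(4 * 6), bits_independent(4 * 6 // 2))
```

On these assumptions a 4×6 pattern carries 24 bits when every cell is free, and 12 bits when a symmetry factor is introduced, which is one way of expressing the redundancy of the symmetrical display.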


Systematic Variations 


Frequently it is desired to con- 
struct ‘‘families’’ of shapes having 
known physical relationships among 
the individual members. Again, there 
are many possible techniques for ac- 
complishing this end. The following 
two methods represent two kinds of 
systematic variations which have re- 
cently been used. 

Method 8. A prototype shape is
constructed by any of the methods
so far described. Then, each point
is moved to a new location and the
connecting lines redrawn as before.
In moving the points, any of the
following parameters may either be
held constant or varied randomly:
(a) the number of points moved, (b)
the particular points moved in making
successive variations on the same
prototype, (c) the distance through
which a point is moved, and (d) the
direction of movement. A number of
variations made from a given prototype
will form a distribution of shapes
which “vary about” the prototype.
Stimuli of this sort were used recently
by Attneave in testing the
hypothesis that knowledge of the
prototype shape, or “schema,” would
facilitate discrimination of the variations
in paired-associate learning (6),
and by Arnoult in a study of the effect
of predifferentiation training
on recognition (1). A typical prototype
shape and its variations are
shown in Fig. 5.
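A minimal sketch of one such variation scheme follows. It is an illustration, not the procedure of the studies cited: here the number of points moved and the distance are held constant, while the particular points and the direction of movement are varied randomly; the function name and parameters are hypothetical.

```python
import math
import random


def vary_prototype(points, n_moved, distance, rng=random):
    """Return a variation of the prototype.

    n_moved vertices, chosen at random, are each displaced by `distance`
    in a randomly chosen direction; the order of connection is unchanged.
    """
    varied = list(points)
    for i in rng.sample(range(len(points)), n_moved):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        x, y = points[i]
        varied[i] = (x + distance * math.cos(angle),
                     y + distance * math.sin(angle))
    return varied


if __name__ == "__main__":
    prototype = [(0, 0), (4, 0), (5, 3), (2, 5), (-1, 3)]
    family = [vary_prototype(prototype, n_moved=2, distance=1.0) for _ in range(4)]
    for member in family:
        print(member)
```

Repeated calls with the same prototype yield a “family” whose members differ from the prototype by a known amount of distortion.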


FIG. 5. A PROTOTYPE SHAPE AND “FAMILY” OF RANDOM VARIATIONS


Method 9. A somewhat different
technique for creating “families” of
shapes has been developed at Stanford
by LaBerge and Lawrence (23).
Initially, a random shape is constructed
by a method essentially
the same as those described in
Method 1 and Method 2 (actually,
LaBerge and Lawrence simply connected
randomly chosen points into
the polygon of minimal perimeter).
Then, each point on the contour is
assigned randomly chosen “x” and
“y” increments to its coordinates,
and these new coordinates are plotted
and connected on a fresh matrix.
These same increments are then
added to the new coordinates and a
third figure is constructed. This process
may be continued until one has
constructed a row of, say, six figures,
each differing from its immediate
neighbors by a constant amount of
distortion as measured by the distance
through which the points move.
The next step is to label the former
“x” increments as “y,” and the
former “y” increments as “x.”
These new increments are added to
the coordinates of the points of all six
of the figures already constructed,
and the process of constructing
successive shapes is repeated until there
is a column of six shapes for each of
the original six shapes. The final result
is a matrix of 36 shapes in which
any two adjacent shapes in a row or
column are equally spaced in terms
of the average distance the points
have moved. Matrices of stimuli
of this sort are currently being used
by LaBerge and Lawrence in studies
of transfer.
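The arithmetic of this scheme may be sketched as follows. The Python fragment is an illustration under assumptions, not LaBerge and Lawrence's own procedure: it takes the second set of increments to be the first set with its x and y components exchanged, which is one reading of the description above, and the function name is hypothetical.

```python
import random


def shape_matrix(base, increments, n=6):
    """Build an n x n matrix of shapes from one base shape.

    matrix[r][c] is the base shape after c applications of `increments`
    (along the row) and r applications of the exchanged increments
    (down the column). Points are (x, y) pairs.
    """
    exchanged = [(dy, dx) for dx, dy in increments]   # assumed relabeling
    matrix = []
    for r in range(n):
        row = []
        for c in range(n):
            row.append([
                (x + c * dx + r * ex, y + c * dy + r * ey)
                for (x, y), (dx, dy), (ex, ey) in zip(base, increments, exchanged)
            ])
        matrix.append(row)
    return matrix


if __name__ == "__main__":
    rng = random.Random(0)
    base = [(0, 0), (10, 0), (12, 7), (5, 11), (-2, 6)]
    increments = [(rng.randint(-2, 2), rng.randint(-2, 2)) for _ in base]
    shapes = shape_matrix(base, increments)
    print(len(shapes), len(shapes[0]))   # 6 rows of 6 shapes
```

Because each point moves by the same vector at every step, adjacent shapes in any row or column differ by a constant average displacement of the points.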

As has been emphasized a number
of times in the preceding discussion,
these methods for constructing “random”
shapes are only a few which
have been selected to show some of
the classes of shapes which can be
constructed. The number of different
sets of rules which can be developed
for plotting and connecting
points taken from a table of random
numbers is limited only by the fertility
of the individual experimenter’s
imagination. It should be reiterated,
however, that using stimuli constructed
by these “random” methods
does not insure that the generalizations
resulting from the research will
be pertinent to all other kinds of
visual stimuli. It guarantees only
that the results will be generalizable
within a particular stimulus-domain,
i.e., to any other stimuli constructed
by the same rules.










ANALYSIS OF NATURAL FORMS 


Let us now return to a problem 
which the methods discussed in the 
previous section by no means obvi- 
ate. We still need a technique, or a 
set of techniques, by means of which 
physical measurements of a psycho- 
logically relevant sort may be ob- 
tained for forms which we have not 
constructed ourselves. Any method 
of “random” construction must em- 
ploy some set of rules, either arbi- 
trary or otherwise, and these rules 
will strictly determine the
class-characteristics, or statistical parameters,
of the shapes constructed. We
should like to be able to devise rules
such that our synthetic shapes might
possess the statistical characteristics
(but not the familiarity) of natural
shapes to which we wish to generalize.
At present, we lack not only
a factual knowledge of the values of
these statistical parameters, but also
a methodology to guide us in their 
determination. Likewise, when some 
experimental variation of form is 
found to produce a certain effect in 
the laboratory, it is necessary that 
the variable in question be identifia- 
ble and measurable outside the labo- 
ratory if the results are to be gen- 
eralized. Unfortunately, however, it 
is much harder to measure form than 
to manipulate it. 

Relatively few scientists have seriously
applied themselves to the problems
of analyzing and describing
form; these problems seem to have
fallen into the cracks between sciences,
and no general quantitative
morphonomy has ever developed.
D'Arcy Thompson's Growth and Form
(27) is virtually the only major work
in the field: it is a fascinating and
impressive book, but its contribution
to the identification of psychophysical
variables is limited. Rashevsky,
whose work in mathematical biophysics
is in some respects a continuation
of Thompson's, has been more
directly concerned with psychologically
relevant measures of form.
Abstraction of contour. Considering 
that the first step in the analysis of a 
shape is the abstraction of its con- 
tour, Rashevsky (25, p. 449) devised 
a simple hypothetical nerve-net with 
this function. Suppose that the stim- 
ulation of the retina is projected to 
some central area as an activity of 
sharply localized excitatory fibers 
and of inhibitory fibers slightly more 
diffuse in their projection. If certain 
constants of the system have proper 
values, excitation from any area of 
uniform brightness will be suppressed,
except at a contour where
such an area is bounded by a darker 
one which provides less inhibition. 
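The behavior of such a net can be illustrated in one dimension. The Python sketch below is only a toy model built on assumptions of our own, not Rashevsky's equations: excitation is taken as the brightness at a point, inhibition as the mean brightness over a slightly wider neighborhood, and the parameters `spread` and `gain` are hypothetical stand-ins for the “constants of the system.”

```python
def contour_response(brightness, spread=2, gain=1.0):
    """Residual excitation along a one-dimensional strip of 'retina'.

    Each point is excited by its own brightness and inhibited by the
    mean brightness within `spread` points on either side.
    """
    n = len(brightness)
    response = []
    for i, excitation in enumerate(brightness):
        lo, hi = max(0, i - spread), min(n, i + spread + 1)
        neighborhood = brightness[lo:hi]
        inhibition = gain * sum(neighborhood) / len(neighborhood)
        response.append(max(0.0, excitation - inhibition))
    return response


if __name__ == "__main__":
    strip = [1] * 6 + [5] * 6          # dark area bounded by a bright area
    print(contour_response(strip))
```

With `gain` equal to 1, areas of uniform brightness give no response, and excitation survives only on the bright side of the boundary, where the neighboring dark area contributes less inhibition.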


This nerve-net has a fairly close 
analogue in the following photo- 
graphic process. A negative and a 
positive transparency, separated by 
a thin plastic sheet, are precisely 
superimposed so that they “cancel” 
each other when viewed from a right 
angle. A print is made by transmit- 
ting light from a diffuse source (e.g., 
the ground glass of a contact printer) 
through the superimposed positive 
and negative to a high-contrast paper 
placed in contact with the negative. 
In the case of a black object on a 
white ground, or vice versa, light 
can angle through both positive and 
negative only at the contour, and 
the resulting print is indistinguishable
from an outline drawing of the
object. In the case of more complex
pictures, the abstraction of sharp
brightness-gradients preserves texture,
as well as contour: this is illustrated
clearly in Fig. 6. A picture obtained
in this way may be thought of
as a differential (with respect to
brightness) of the original, involving
a “delta” of finite magnitude. If
a smaller “delta” had been taken in
the derivation of Fig. 6 (by reducing
the space between the superimposed
positive and negative), the iris and
pupil of Thompson's eye, for example,
would appear in outline instead
of as a black dot.

FIG. 6. A DIFFERENTIAL PICTURE. In the original, the lightest portions are Thompson's forehead and beard, and the darkest portion is the back of his coat. These have approximately equal brightness in the differential picture. The original appeared in Isis and was reproduced in the August, 1952, Scientific American.

In 1948 one of the authors (Att- 
neave), in collaboration with John 
M. Stroud, attempted to develop 
this photographic technique to a de- 
gree of precision such that the total 
reflectance of the differential picture 
might serve as an index of the com- 
plexity of the original. That attempt 
was unsuccessful for several reasons, 
having to do chiefly with the unrelia- 
bility of photographic operations: 
e.g., the initial step of making a positive
and a negative which would adequately
cancel always required considerable
cut-and-try. It may be
added that the process is a close relative
of one which has long been used
to produce a “bas-relief” effect, and
that the Eastman Laboratories have
recently employed a similar technique
with color film to obtain photographs
which look remarkably like
paintings.

An electronic device lately developed
by Kovasznay and Joseph
at the National Bureau of Standards
appears to accomplish much the same
result as the photographic process
described above, but in a manner subject
to more precise control. The
beam of a cathode ray tube, moving
in a complex scan which covers the
field in two orthogonal dimensions,
transmits light through a photographic
transparency to a photoelectric
cell. The electrical signal thus
generated is differentiated and squared
electronically, and then fed into a
receiving scope where it modulates a
beam synchronized with the transmitting
beam. Illustrations of the
results, which are presented in the
descriptive note of Kovasznay and
Joseph (21), could be mistaken for
the efforts of a somewhat naive artist.
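A discrete analogue of the “differentiate and square” operation may make the principle concrete. The Python sketch below is an assumption-laden illustration and not a model of the device itself: the picture is taken as a matrix of brightness values, differentiation is replaced by differences between neighboring cells along two orthogonal directions, and the function name is hypothetical.

```python
def differentiate_and_square(image):
    """Sum of squared brightness differences along rows and columns.

    `image` is a list of equal-length rows of numbers. Cells on the
    last row or column have no forward neighbor and contribute zero
    in that direction.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dx = image[r][c + 1] - image[r][c] if c + 1 < cols else 0.0
            dy = image[r + 1][c] - image[r][c] if r + 1 < rows else 0.0
            out[r][c] = dx * dx + dy * dy
    return out


if __name__ == "__main__":
    picture = [[0, 0, 1, 1] for _ in range(4)]   # dark left, bright right
    for row in differentiate_and_square(picture):
        print(row)
```

Squaring makes the output independent of the sign of the brightness change, so that a boundary is marked in the same way whether it runs from dark to light or from light to dark.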

A group of engineers in the Lin- 
coln Laboratory of M.I.T., including 
Oliver G. Selfridge, Gerald P. Din- 
neen, and Marshall Freimer, are cur- 
rently experimenting with the use of 
digital computers to perform opera- 
tions relevant to object identifica- 
tion. They have been successful in 
programming a contour-abstracting 
operation; this is preceded by an av- 
eraging operation, which rids the fig- 
ure of irrelevant detail, and followed 
by an operation which abstracts 
angles, or regions of high curvature, 
from the contour (26). 

The mere abstraction of contour, 
whether by an objective process or 
with the aid of the experimenter’s 
own perceptual machinery, does not 
in itself constitute quantification. It 
does, however, contribute to the iso- 
lation of that which is to be quanti- 
fied: i.e., form. Whenever we speak 
of form, we are referring to a some- 
what vague set of properties which 
are invariant under transformations 
of color and brightness, size, place, 
and orientation; our definition may 
or may not be extended to specify 
invariance under projective (or per- 
spective) transformations. Contour 
is characterized by invariance under 
color and brightness transformations. 
Attneave (3) has previously pointed 
out the related (though not equiva- 
lent) fact that contours are regions 
of relatively high informational con- 
tent. 

Analysis of contour. There are vari- 
ous practical reasons for wishing to 
be able to describe a contour in terms 
which are independent of its size, 
place, and orientation. For example, 
subjects are often required to draw 
figures from memory: such drawings 
cannot be fairly evaluated by any
simple method of superimposing a
drawing upon the original and meas- 
uring deviations, because of differ- 
ences in scale, etc. If both the origi- 
nal and the reproduction could be 
represented in terms descriptive of 
form alone, they could then be com- 
pared objectively. 

Such a representation may take the 
form of a single function. If the re- 
ciprocal of the radius of curvature of 
a closed contour is plotted against 
distance along the contour, a peri- 
odic function results. This function 
may be normalized (i.e., rendered in- 
dependent of the scale of the original 
figure) by assigning a value of unity 
to the perimeter of the figure and ex- 
pressing radius of curvature in com- 
parable terms, or by setting equal to 
unity the area under one period of 
the function. An angle is represented 
by a vertical line which rises (or falls,
in the case of a concave angle) to
infinity; a spike of this sort, of infinite
height, infinitesimal width, and
determinate area, is the so-called
δ-function of Dirac, and is amenable to
mathematical treatment.*
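For a polygon the curvature function consists entirely of such spikes, and their positions and areas are easily computed. The Python sketch below is an illustration under assumptions of our own: the area of each spike is taken as the turning angle at the vertex (the integral of curvature over the corner), the perimeter is normalized to unity, and the function name is hypothetical.

```python
import math


def curvature_spikes(vertices):
    """Describe a closed polygon as a list of (position, turning_angle).

    position      arc length to the vertex as a fraction of the perimeter
    turning_angle area of the delta spike at that vertex, in radians;
                  positive for counterclockwise (convex) turns when the
                  vertices are listed counterclockwise, negative for
                  concave angles
    """
    n = len(vertices)
    edges = [(vertices[(i + 1) % n][0] - vertices[i][0],
              vertices[(i + 1) % n][1] - vertices[i][1]) for i in range(n)]
    lengths = [math.hypot(dx, dy) for dx, dy in edges]
    perimeter = sum(lengths)
    spikes = []
    travelled = 0.0
    for i in range(n):
        travelled += lengths[i]          # now at the end of edge i
        ax, ay = edges[i]
        bx, by = edges[(i + 1) % n]
        turn = math.atan2(ax * by - ay * bx, ax * bx + ay * by)
        spikes.append((travelled / perimeter, turn))
    return spikes


if __name__ == "__main__":
    square = [(0, 0), (2, 0), (2, 2), (0, 2)]
    print(curvature_spikes(square))
```

The description is unchanged by enlarging, moving, or rotating the figure, and the areas of the spikes of any simple closed polygon sum to one full turn.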

* This system of representation has been
developed in considerable detail by Oliver
Strauss (personal communication).

If one feels more comfortable dealing
with finite ordinates, the following
system may be used. Imagine a
miniature tricycle, guided over a
course such that a point midway between
the rear wheels precisely follows
the contour. The angle θ by
which the front wheel deviates from
a forward position may be plotted
against distance travelled by the
front wheel to give a periodic function
descriptive of the contour. The
front wheel will move in an arc concentric
with the segment of the contour
being followed. Wherever an
angle occurs in the contour, the angle
θ of the front wheel will be 90°; thus
the function will always have some
value between plus and minus 90°.
Radius of curvature, r, is related to
θ by the equation r = L cot θ, in
which L is the distance between the
front and rear wheels. Normalizing
may be accomplished by giving the
perimeter of the figure unit value, and
setting L at some standard fractional
value. If L is made to equal 1/2π,
regular polygons will be represented
by square waves regularly alternating
between 0 and 90°, a circle will
become a horizontal line with an
ordinate of 45°, and certain other
regularities will be uniquely represented;
this value of L is somewhat
large for convenient use with more
complex shapes, however. The interested
reader will have little difficulty
in working out further details
of the system. It has the advantage
of specifying an actual measuring device
which is practical and simple to
construct. Automatic recording of
the function could be arranged with
two pairs of selsyns: one translating
the rotation of the front wheel into a
movement of the recording paper;
the other coupling the angular position
of the front wheel with the position
of a recording pen.
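The relation r = L cot θ can be checked numerically. The Python sketch below simply inverts it; the function name and the convention of giving concave curvature a negative radius are assumptions made for illustration.

```python
import math


def wheel_angle(radius, wheelbase):
    """Front-wheel angle theta, in degrees, from r = L cot(theta).

    radius    radius of curvature r of the contour (0 at an angle,
              math.inf on a straight segment, negative if concave)
    wheelbase distance L between the front and rear wheels
    """
    if radius == 0:
        return 90.0
    if math.isinf(radius):
        return 0.0
    return math.degrees(math.atan(wheelbase / radius))


if __name__ == "__main__":
    L = 1 / (2 * math.pi)            # the standard value suggested above
    circle_radius = 1 / (2 * math.pi)  # circle of unit perimeter
    print(wheel_angle(circle_radius, L))   # approximately 45 degrees
    print(wheel_angle(math.inf, L), wheel_angle(0, L))
```

With the perimeter normalized to unity and L equal to 1/2π, the radius of a circle equals L, so that θ is 45° throughout; a polygon alternates between 0° on its sides and 90° at its angles.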

Both of the functions just de- 
scribed have a serious disadvantage. 
Suppose we wish to compare two 
shapes which have a part-to-part or 
part-to-whole similarity—say, the 
outline of a cow’s head with the out- 
line of a whole cow. The normalizing 
factors which will be employed on a 
basis of perimeter or area will obvi- 
ously not be such as to give compara- 
ble representation to the similar por- 
tions of the outlines. 

The method next to be described
avoids this difficulty, though it is not
without limitations of its own. Instead
of describing the contour by
means of a continuous function, we
may attempt to analyze it into parts
which are individually homogeneous,
and hence amenable to approximate
description in terms of a few standardized
dimensions. It is usually
possible to construct a polygon about
a figure made up of complex lines
and curves, as in Fig. 7, by drawing
tangents (a) at points of zero curvature
(e.g., CD, IJ, etc.: whenever a
curve changes from concave to convex,
it must have an intermediate
point of zero curvature), (b) at points
of minimal curvature, where a decrease
in curvature is followed by
an increase (e.g., FG), and (c) at
discontinuities of slope, or angles (e.g.,
AB, GH, etc.). The series of lines
thus formed may be described simply
by stating the slope and length of
each line in succession, but this
description is peculiar to a given
orientation and size of the figure. It may
be rendered orientation-free and scale-free
by specifying instead, for each
pair of adjacent segments, (a) the
change in direction (in degrees), and
(b) the change in the logarithm of
length, as the contour is followed in a
clockwise direction.† Curves are
treated as “rounded-off” angles: i.e.,
a curve is approximated by an arc
located tangent to two successive
lines of the polygon we have been
discussing. In most cases, the size of
the arc will be limited by the length
of the shorter of the two segments.
Hence curvature is conveniently expressed
by a third coordinate specifying
(c) the proportion of the distance
between the apex of the angle and the
end of the shorter segment at which the
arc best approximating the curve is
tangent. This coordinate will usually
have some value between 0 and 1.0,
with 0 indicating an abrupt angle
(radius of curvature equal to zero)
and 1.0 indicating an arc which is
tangent to the shorter segment
at its end. In the case of Fig. 7,
for example, (c) would have a
value of 0 at A and M, a value of
1.0 at G, and a value of about .8 at C
(note that the arc best approximating
a curve will not necessarily have
the same point of tangency as the
original curve). When the arc
approximating a curve turns through
more than 180°, as in the case of the
bulbous projection in the JKL region,
the value of (c) will not remain
between 0 and 1, since some of the
points of tangency are on extensions
of segments of the polygon, rather
than on the segments themselves.
The values of (c) associated with J,
K, and L would be about 3.8, 3.6,
and .5, respectively.‡

FIG. 7. ILLUSTRATION OF METHOD FOR QUANTIZING IRREGULAR CONTOUR

† Several other possible pairs of coordinates
convey the same information. What is required,
essentially, is to describe the shapes
of successive segments of the polygon, taken
in pairs. Measures of any two angles of such
a triangle, or any two ratios of sides or
differences between logarithms of sides, or any
combination of an angle and a comparison of
sides, is adequate to specify the shape of the
triangle. The combination above is chosen
for its intuitive appeal; also because errors of
measurement have a more uniform effect on
these coordinates than on certain others.
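The first two coordinates of this system, and the cumulative reconstruction described in the accompanying footnote, can be sketched in a few lines. The Python fragment below is an illustration under assumptions: the contour is an open chain of straight segments, the third coordinate (c) is omitted, headings are measured counterclockwise from the x axis with the y axis pointing up, and the function names are hypothetical.

```python
import math


def encode(vertices):
    """(a, b) for each pair of adjacent segments of a chain of points.

    a  change in direction in degrees, clockwise turns positive
    b  change in log10 of segment length
    """
    segs = [(x1 - x0, y1 - y0)
            for (x0, y0), (x1, y1) in zip(vertices, vertices[1:])]
    coords = []
    for (ax, ay), (bx, by) in zip(segs, segs[1:]):
        turn_ccw = math.degrees(math.atan2(ax * by - ay * bx,
                                           ax * bx + ay * by))
        b = math.log10(math.hypot(bx, by)) - math.log10(math.hypot(ax, ay))
        coords.append((-turn_ccw, b))
    return coords


def decode(coords, start=(0.0, 0.0), length=1.0, heading=0.0):
    """Rebuild the chain from an arbitrary starting line.

    Values of a and b are accumulated from the starting line, so that
    errors in individual segments are not compounded by redrawing.
    """
    x, y = start
    log_len = math.log10(length)
    points = [(x, y)]
    for a, b in [(0.0, 0.0)] + list(coords):   # starting line, then each pair
        heading -= a          # a clockwise turn reduces the heading
        log_len += b
        size = 10 ** log_len
        x += size * math.cos(math.radians(heading))
        y += size * math.sin(math.radians(heading))
        points.append((x, y))
    return points


if __name__ == "__main__":
    shape = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (1.0, 3.0)]
    print(encode(shape))
    print(decode(encode(shape), length=4.0))
```

Encoding a figure, then the same figure enlarged and rotated, gives the same coordinates; decoding with a different starting length or slope merely changes the size or orientation of the result.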

‡ The two sets of numbers below, which are
presented as a demonstration of the practicability
of the system and as an amusement for
the reader, describe recognizable profiles of
the two authors. Successive tri-coordinates
are given, thus: a1, b1, c1; a2, b2, c2; etc. In the
actual reconstruction of a contour from such
coordinates, a line of any desired length and
slope is drawn to start. The first triad of
coordinates gives the relationship of the
second line of the contour to the arbitrarily
drawn starting line, and so on. Cumulative
error will be avoided if the values of a are
cumulatively added to the slope, in degrees, of
the starting line (with due regard for the
circularity of the scale) to obtain the slope of each
segment; likewise if the values of b are
cumulatively added to the log10 length of the
starting line to obtain the log10 length of each
segment. Positive values of a denote clockwise
turns; negative values counterclockwise
(consistent reversal results in a mirror-image). In
specifying values of (c), the symbol “<” is
used to mean “less than .1,” i.e., that the
angle is rounded to a slight, practically
unmeasurable, degree.

+61, -.56, .3; -84, +.08, <; +27, -.04,
1; +35, +.66, .2; +155, -.06, 1.0; -145,
-.39, 0; +32, +.13, 1.0; +27, -.43, <;
-56, +.61, .1; +107, -.38, .4; -69, -.12,
1; +37, -.62, <; -88, +.38, 0; +105,
+.11, 1.0; -79, +.33, .1; +86, +.34, .5;
-20, -.14, <; -25, -.76, .4; -66, +.23,
0; 1.0; +30, +.09, 0; +113,
-.02, 0; -56, +.15, .1.

+43, +.14, .4; 4; -31, +.02,
3; +37, +.21, 4; +36, -.21, .6; +41, -.41,
.3; -50, +.45, 0; -6, -.18, <; +29, +.03,
8; -43, +.26, <; +12, -.18, <; +88,
-.11, 5; +23, -.27, 4; -124, +.31, .5;
+90, -.12, .8; -29, -.28, <; +70, -.23,
1.0; -114, +.39, 0; +66, +.16, .3; -65,
-.04, 4; +44, -.17, <; -43, +.77, .7;
+121, -.28, .9; -26, +.04, <; +105, -.08,
7; -62, -.49, .6; +26, +.09, <; -113,
+.13, .3; -25, +.25, .5; +44, -.26, .8; -83,
-.24, 0; +19, +.42, 1.0.

-55, -.15,

The reader will recognize this system
of analysis as essentially the reversal
of a method for constructing
“random” shapes which was
described in the previous section. The
system has several advantages: (a) 
It yields a description which does 
not vary with size and orientation. 
(b) Since the use of a general normal- 
izing factor is avoided, part-similari- 
ties between the contours of two ob- 
jects are reflected in their numerical 
descriptions. Likewise, repetitious 
sequences of elements in the same 
contour (but not parallel lines) are 
reflected, and could be quantified by 
an autocorrelational technique. If 
two similar shapes (e.g., an original 
and a subject’s reproduction from 
memory) were compared by cross 
correlation of their numerical de- 
scriptions, it would be desirable to 
calculate values for several ‘‘displace- 
ments” of one set of coordinates upon 
the other (as in autocorrelation), in 
order to allow for qualitative omis- 
sions or additions of elements.  (c) 
There is reason to believe that the 
number of tricoordinates required to 
describe a shape constitutes a first 
order approximation of its psycho- 
logical complexity (i.e., the number 
of psychologically discrete parts which 
it contains). Fehrer (14) used a simi- 
lar measure (number of internally 
homogeneous lines) on her figures, 
and found that complexity, so meas- 
ured, was closely related to difficulty 
in a reproduction-learning situation. 
Attneave (5) recently confirmed that 
the number of sides in a polygon is 
the primary determinant of its judged 
complexity. A better approximation 
would require some adjustment for 
repetitious sequences of elements, 
mentioned above (see Rashevsky, 
25, p. 486 ff.; also Attneave, 3, 4). 
The major disadvantage of the sys- 
tem is that some figures (spirals are 
an obvious case) do not yield unique 
descriptions. This limitation arises 
inevitably from the approximation 
of all curves with straight lines and
arcs, and the ignoring of higher-order
invariances. It is interesting to speculate
that the system might be to some
degree psychomimetic even in this
limitation, and that objects for which
it does not yield unique descriptions
are less likely to evoke reliable per- 
ceptual responses, with the result that 
they may be perceived as ‘‘amor- 
phous” or “unstable,” and be diffi- 
cult to remember. 
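The suggestion of cross-correlating two numerical descriptions at several “displacements” can be illustrated briefly. The Python sketch below is one possible realization under assumptions, not a procedure given in the text: a single coordinate (say, the values of a) is correlated by the ordinary product-moment formula, and the function names and the limit on displacement are hypothetical.

```python
import math


def pearson(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def best_displacement(original, reproduction, max_shift=3):
    """Correlate two descriptions at several displacements.

    A positive shift skips elements at the start of `original`; a
    negative shift skips elements at the start of `reproduction`.
    Returns (shift, r) for the displacement giving the highest r.
    """
    best = None
    for shift in range(-max_shift, max_shift + 1):
        if shift >= 0:
            xs, ys = original[shift:], reproduction
        else:
            xs, ys = original, reproduction[-shift:]
        n = min(len(xs), len(ys))
        if n < 3:                       # too few pairs to correlate
            continue
        r = pearson(xs[:n], ys[:n])
        if best is None or r > best[1]:
            best = (shift, r)
    return best


if __name__ == "__main__":
    original = [61, -84, 27, 35, 155, -145, 32, 27]   # illustrative a values
    reproduction = original[2:]       # subject omitted the first two elements
    print(best_displacement(original, reproduction))
```

In the example the reproduction lacks the first two elements of the original, and the highest correlation is accordingly found at a displacement of two.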

Measuring operations like the fore- 
going, which involve following about 
a contour, are laborious to accom- 
plish manually. It appears, however, 
that they are quite amenable to 
automation by electronic and me- 
chanical means. For example, an 
electronic contour-follower, described 
by Beurle (7), has already been con- 
structed. A point of light is moved 
rapidly through a very small circle; 
when its path crosses the contour, a 
signal is obtained. The phase rela- 
tionship of this signal to the circular 
movement is used to guide the circle 
along the contour, i.e., to move the
point of light about the contour in a
cycloidal path. A record of the movement
of the circle, taken from the
servo control loop, constitutes a
description of the shape which may
further be transformed and analyzed
by computer-type circuits. 

Measurement of  gestalt-variables. 
We have been considering analytical 
systems by means of which the for- 
mal properties of contours may be 
described in detail. Also of interest 
is another set of variables which do 
not provide a description from which 
the shape can be reconstructed, but 
which do abstract important prop- 
erties of the shape as a whole. We 
shall refer to these as “gestalt-variables,”
or “gestalt-measures,” even
when they serve to summarize some 
quantizing or analytical process: e.g., 
the number of sides in a polygon is 




such a variable; so is the number of 
tri-coordinates necessary to describe 
a shape by the system discussed 
above. Likewise, the mean value of 
the c-coordinate in that system might
be taken as a crude measure of over- 
all curvedness-vs.-angularity. It 
should be clear that the “statistical 
parameters” of populations of shapes, 
referred to earlier, necessarily pertain 
to distributions of certain gestalt- 
measures. 

The more restricted notion of a 
gestalt as a system in which every 
part is affected by every other part 
has been incorporated by Rashevsky 
(25, p. 451 ff.) into a hypothetical 
nerve-net. Suppose that the contour 
of an object is projected to some sheet 
of neurons in the cortex as an isomorphic
excitation (Rashevsky’s
mechanism for contour-abstraction 
has already been described). Sup- 
pose further a distribution of inhibi- 
tory fibers such that, in the next 
higher projection area, every point 
on the contour (i.e., every excited 
neuron) receives inhibition from ev- 
ery other point in an amount which 
varies as a function (presumably de- 
creasing) of the distance between the 
points. At this level, the various 
neurons to which the contour is pro- 
jected will retain more or less residual 
excitation, depending upon the de- 
gree to which each is isolated from 
the others. A given contour will be 
characterized (though not uniquely) 
by some distribution of residual exci- 
tations which will be invariant with 
respect to its place and orientation 
in the field (but not with respect to 
its size). The integral, or mean, of 
this distribution would constitute a 
measure of the “simplicity” or compactness
of the figure (provided size
were held constant, or corrected for); 
e.g., a circle would have the highest 
possible value, since its points are as 







far from one another as is possible in 
a closed contour, and jagged or sinu- 
ous shapes would have low values 
(see Householder, 18). The neuro- 
logical terms in which this model is 
presented need not be taken too seri- 
ously; Rashevsky's basic idea might 
equally well be applied to the pro- 
gramming of a man-made computer, 
or to a series of photographic opera- 
tions. 
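
The mutual-inhibition idea can be sketched for a man-made computer roughly as follows. This is only an illustration under stated assumptions, not Rashevsky's formulation: the exponential fall-off of inhibition with distance, the `scale` constant, the sampling of the contour at evenly spaced points, and the function names are all hypothetical choices. Distances are divided by the perimeter as one way of correcting for size.

```python
import math

def resample_closed(vertices, n):
    """Return n points evenly spaced by arc length on a closed polygon."""
    edges = list(zip(vertices, vertices[1:] + vertices[:1]))
    lengths = [math.dist(a, b) for a, b in edges]
    perimeter = sum(lengths)
    points = []
    for k in range(n):
        target = (k + 0.5) * perimeter / n      # arc-length position of sample k
        for (a, b), length in zip(edges, lengths):
            if target <= length:
                f = target / length
                points.append((a[0] + f * (b[0] - a[0]),
                               a[1] + f * (b[1] - a[1])))
                break
            target -= length
    return points, perimeter

def mean_residual_excitation(vertices, n=200, scale=0.1):
    """Mean excitation left after every contour point inhibits every other.

    Assumed inhibition law: exp(-distance / (scale * perimeter)).
    """
    points, perimeter = resample_closed(vertices, n)
    total = 0.0
    for i, p in enumerate(points):
        inhibition = sum(
            math.exp(-math.dist(p, q) / (scale * perimeter))
            for j, q in enumerate(points) if j != i
        ) / (len(points) - 1)
        total += 1.0 - inhibition
    return total / len(points)

# A 64-sided polygon stands in for the circle; the sliver is a 49 x 1 rectangle.
circle = [(math.cos(2 * math.pi * k / 64), math.sin(2 * math.pi * k / 64))
          for k in range(64)]
sliver = [(0.0, 0.0), (49.0, 0.0), (49.0, 1.0), (0.0, 1.0)]

circle_value = mean_residual_excitation(circle)
sliver_value = mean_residual_excitation(sliver)
print(circle_value, sliver_value)
```

Under these assumptions the circle retains more residual excitation than the thin rectangle, in line with the ordering described above; other inhibition laws might of course rank intermediate shapes differently.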

Deutsch (12) has recently sug- 
gested a model for shape perception 
which is somewhat akin to Rashev- 
sky's. Since it may be described very 
simply in terms of geometrical con- 
cepts, we shall ignore the neural 
mechanisms which Deutsch proposes 
as its basis. Suppose that a perpen- 
dicular is drawn to a closed contour 
at every point along its length. Each 
such perpendicular will contain a 
segment which lies inside, and is 
bounded by, the contour. The 
lengths of these segments will have
some distribution which will depend
upon the shape of the contour; this
distribution may be rendered size-invariant
by expressing the length
of each segment as a proportion of 
the length of the contour. In the case 
of a circle, a square, or any other 
regular polygon with an even number 
of sides, the distribution will consist 
of a single spike, since all the seg- 
ments will be of equal length. 
Deutsch suggests that the most prim- 
itive mechanism of form-discrimina- 
tion may abstract a distribution of 
this sort; at the human or primate 
level it would obviously need supple- 
menting with some finer mechanism, 
perhaps one involving contour-fol- 
lowing. He points out that rats have 
more difficulty discriminating a 
square from a circle than from a tri- 
angle, and predicts further that regu- 
lar polygons with even numbers of 
sides should be more difficult to discriminate
from one another than
from odd-sided polygons. 
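
The geometrical part of Deutsch's proposal can be illustrated with a short sketch. It is a simplified reading, not Deutsch's own procedure: it assumes a convex polygon whose vertices are listed counterclockwise, samples a fixed number of points on each side, and uses hypothetical names throughout.

```python
import math

def perpendicular_chords(vertices, samples_per_edge=20):
    """Lengths of the inward perpendicular segments of a convex polygon
    (vertices counterclockwise), each as a proportion of the perimeter."""
    n = len(vertices)
    edges = [(vertices[i], vertices[(i + 1) % n]) for i in range(n)]
    perimeter = sum(math.dist(a, b) for a, b in edges)
    chords = []
    for a, b in edges:
        dx, dy = b[0] - a[0], b[1] - a[1]
        length = math.hypot(dx, dy)
        nx, ny = -dy / length, dx / length       # inward unit normal
        for k in range(samples_per_edge):
            f = (k + 0.5) / samples_per_edge     # stay clear of the vertices
            px, py = a[0] + f * dx, a[1] + f * dy
            best = math.inf
            for c, d in edges:
                ex, ey = d[0] - c[0], d[1] - c[1]
                denom = nx * ey - ny * ex
                if abs(denom) < 1e-12:
                    continue                     # side parallel to the perpendicular
                wx, wy = c[0] - px, c[1] - py
                t = (wx * ey - wy * ex) / denom  # distance along the perpendicular
                s = (wx * ny - wy * nx) / denom  # position along the side that is hit
                if t > 1e-9 and -1e-9 <= s <= 1 + 1e-9:
                    best = min(best, t)
            chords.append(best / perimeter)
    return chords

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangle = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]

square_chords = perpendicular_chords(square)
triangle_chords = perpendicular_chords(triangle)
print(max(square_chords) - min(square_chords))      # spread for the square
print(max(triangle_chords) - min(triangle_chords))  # spread for the triangle
```

For the square every segment equals one side, so the distribution collapses to a single value (one quarter of the perimeter), whereas the triangle yields a broad spread of lengths.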

Merely to order shapes along a 
compactness-dispersion continuum 
requires nothing so elaborate as the 
Rashevsky model outlined above. 
The relationship of the perimeter of a 
shape to its area provides an attrac- 
tively simple means of measuring 
this characteristic. The quotient 
P/A, which has been employed by 
some investigators (8, 17), is unsatis- 
factory from our standpoint because 
it varies with size as well as with 
shape, but either P²/A or P/√A is
size-invariant. These ratios may be 
transformed in various ways to suit 
the user’s convenience; e.g., the meas- 
ure 


\[ D = 1 - \frac{2\sqrt{\pi A}}{P} \]


expresses dispersion as some number 
between zero and one, assigning zero 
value to the most compact figure pos- 
sible, the circle. Dispersion (as meas- 
ured by any such relationship of 
perimeter to area) is not the same as 
complexity (in the sense of number of 
parts). Although a deeply convoluted 
or jagged figure will indeed tend to 
have a high dispersion value, so will 
a very thin rectangle or ellipse. 
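
A brief sketch may make the behavior of these measures concrete. It assumes shapes given as simple polygons and uses the shoelace formula for area; the function names and the example figures are illustrative only.

```python
import math

def perimeter_and_area(vertices):
    """Perimeter and shoelace area of a simple polygon (list of (x, y))."""
    pairs = list(zip(vertices, vertices[1:] + vertices[:1]))
    perimeter = sum(math.dist(a, b) for a, b in pairs)
    area = abs(sum(a[0] * b[1] - b[0] * a[1] for a, b in pairs)) / 2.0
    return perimeter, area

def dispersion(vertices):
    """D = 1 - 2*sqrt(pi*A)/P; zero for a circle, larger for dispersed shapes."""
    p, a = perimeter_and_area(vertices)
    return 1.0 - 2.0 * math.sqrt(math.pi * a) / p

def rectangle(w, h):
    return [(0.0, 0.0), (w, 0.0), (w, h), (0.0, h)]

# A 360-sided regular polygon stands in for the circle.
near_circle = [(math.cos(math.radians(k)), math.sin(math.radians(k)))
               for k in range(360)]

circle_d = dispersion(near_circle)
square_d = dispersion(rectangle(1.0, 1.0))
scaled_d = dispersion(rectangle(3.0, 3.0))   # same shape, three times the size
sliver_d = dispersion(rectangle(10.0, 0.1))  # very thin rectangle
print(circle_d, square_d, scaled_d, sliver_d)
```

The square gives the same value at both sizes, which is the size-invariance claimed for P²/A and P/√A, and the thin rectangle scores high despite having only four sides, which is the distinction between dispersion and complexity drawn above.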
Bitterman, Krauskopf, and Hoch- 
berg (8, 22) have found that under 
conditions of low illumination or
short exposure, shapes are perceived 
in much the same way as if they were 
physically diffused, or blurred. These 
experimenters created a physical dif- 
fusion model by cutting filter paper 
into various shapes and impregnat- 
ing it with an inhibitor of bacterial 
growth. This inhibitor was then al- 
lowed to diffuse from the paper into 
bacterial cultures. The shapes which 
most resembled each other after dif- 
fusion were those most often confused 










under adverse viewing conditions. 
Likewise, identification of impover- 
ished stimuli was most impaired in 
the case of shapes characterized by 
relatively small detail, which would 
be averaged out in a diffusion process. 

These findings are interesting and 
important, but the clumsy and some- 
what bizarre bacterial model does 
not lend itself to quantitative predic- 
tion. There is no apparent reason 
why it might not be replaced with a 
model employing optical blur, in 
which case diffusion would be meas- 
ured by the radius of the blur circle. 
An image may readily be blurred to 
a measurable degree in an ordinary 
photographic enlarger, and then re- 
sharpened by means of high-contrast 
paper or film (cf. Method 5 under 
‘The Construction of Stimuli’). This 
resharpening process introduces an- 
other parameter, that of the black- 
white threshold to be used in print- 
ing. It is easiest photographically 
simply to employ long exposure and 
development, with the result that a 
white-on-black figure will diffuse 
outward into the field to the full ex- 
tent of the radius of the blur circle. 
If it is desired that concavities and 
convexities be affected symmetri- 
cally, however (note that a psycho- 
logical question requiring an empiri- 
cal answer is thus raised), it is neces- 
sary to resharpen the image into 
black and white about some inter- 
mediate gray such that a linear con- 
tour between black and white fields 


will be restored to its original position.
This may be accomplished
with the aid of a suitable test-figure.


* Dinneen (13) has succeeded in program- 
ming a digital computer to perform averaging- 
and-resharpening operations of almost ex- 
actly this sort. His paper, which contains 
copious illustrations of the effect of varying 
resharpening threshold, is recommended to 
the reader who finds the above discussion in- 
sufficiently informative. 




Over a wide range of values on the 
resharpening threshold parameter, 
the process of blurring and resharpen- 
ing will decrease the dispersion 
(P²/A, or D) of any shape except a
circle, which is already the most com- 
pact shape possible. For any such 
value, dispersion will tend to decrease 
as amount of blur increases, but the 
form of this function—which we shall 
call a blur-response function—will 
vary with the shape involved and 
will describe certain important char- 
acteristics of the shape. Since the 
decrease in the function is associated 
with the “washing out” of progressively
larger detail as the blur circle
increases in size, any sharp drop indi- 
cates that the shape contains con- 
siderable detail of a magnitude indi- 
cated by the blur circle at that point. 
The blur-response function (or, per- 
haps better, its derivative) is thus a 
potential aid in the statistical evaluation
of “magnitude of critical detail,”
which Bitterman, et al., found to be 
of primary importance in determin- 
ing the identifiability of an impover- 
ished shape (8). A full exploration of 
the properties of such functions (par- 
ticularly in the case of shapes char- 
acterized by certain types of regu- 
larity, or redundancy) is beyond the 
scope of this paper; our purpose here 
is merely to suggest their feasibility 
and possible usefulness. One further 
point should be made, however; 
neither the blur-response function 
nor any other gestalt measure can 
possibly predict the relative identifi- 
ability of shapes except in a limited, 
statistical way. The kinds and de- 
grees of similarity which an impov- 
erished shape bears to all the other 
shapes with which it might be con- 
fused will clearly affect the difficulty 
with which it is identified (quite 
apart from any intrinsic properties 
it may have), and these similarities 





may be evaluated, if at all, only by 
recourse to analytical measures. A 
particular detail in a shape may or 
may not be critical to identification, 
depending upon the specific discrimi- 
nations which identification requires. 

Gestalt measures, as defined earlier, 
all involve a reduction in the dimen- 
sionality of figures (sometimes, though 
not necessarily, to a single dimension) 
with a concomitant discarding of in- 
formation. The number of operations 
by means of which a shape may be 
“collapsed” to lower dimensionality
is indefinitely large, as Selfridge (26) 
has recently pointed out. At the 
simplest level, for example, we may 
literally collapse a shape upon any 
spatial axis by plotting, as a function 
of distance along that axis, the thick- 
ness of the shape in the orthogonal 
dimension (26, Fig. 3). The axis involved
need not even be linear; e.g.,
it might be a circle about the center
of gravity of the shape (cf. Pitts and
McCulloch, 24).
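
The simplest of these collapsing operations can be sketched as follows, assuming the shape is represented as a small grid of cells (1 inside the figure, 0 outside). The grid representation and the function names are illustrative assumptions, not taken from Selfridge's paper.

```python
def collapse_onto_x(rows):
    """Thickness of the shape (count of filled cells) at each x position."""
    return [sum(column) for column in zip(*rows)]

def collapse_onto_y(rows):
    """Thickness of the shape at each y position."""
    return [sum(row) for row in rows]

# An L-shaped figure on a 3 x 4 grid.
l_shape = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
flipped = l_shape[::-1]                  # the same figure turned upside down

profile_x = collapse_onto_x(l_shape)     # [3, 1, 1, 1]
profile_y = collapse_onto_y(l_shape)     # [1, 1, 4]
flipped_x = collapse_onto_x(flipped)     # identical to profile_x
print(profile_x, profile_y, flipped_x)
```

The upright and inverted figures yield the same profile on the horizontal axis, a small instance of the discarding of information that accompanies any such reduction in dimensionality.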

Of all the conceivable physical 
measures of shape, analytical as well 
as gestalt, there are undoubtedly 
many that have little or no value 
from a psychophysical point of view. 
On the other hand, it appears un- 
likely that any single system of physi- 
cal measurement can be optimal for 
all psychophysical situations: in other 
words, we are suggesting that form 
perception involves a number of dif- 
ferent psychological mechanisms 
which function in a complementary, 
and to some degree overlapping, man- 
ner. Unfortunately, there is no quick 
and easy way to determine which 
physical measurements have greatest 
psychological relevance; only experi- 
mentation can answer this question. 
The preceding discussion and review 
may at least serve, however, to alleviate
somewhat the paucity of hypotheses
which in the past has characterized
this research area.


REFERENCES 


1. ARNOULT, M. D. Recognition of shapes
   following paired-associate pretraining.
   Paper read by title at USAF-NRC
   Sci. Sympos., Washington, D. C.,
   November, 1955.

2. ARNOULT, M. D. A comparison of training
   methods in the recognition of spatial
   patterns. USAF Personnel Train. Res.
   Center, Res. Rep., No. AFPTRC-TN-56-27.

3. ATTNEAVE, F. Some informational aspects
   of visual perception. Psychol.
   Rev., 1954, 61, 183-193.

4. ATTNEAVE, F. Symmetry, information,
   and memory for patterns. Amer. J.
   Psychol., 1955, 68, 209-222.

5. ATTNEAVE, F. Physical determinants of
   the judged complexity of shapes. J.
   exp. Psychol. (in press).

6. ATTNEAVE, F. Effect of familiarization
   with a class-prototype on identification-learning
   of shapes. Amer. Psychologist,
   1955, 10, 400-401. (Abstract)

7. BEURLE, R. L. Discussion in W. Jackson
   (Ed.), Communication theory. New
   York: Academic Press, 1953. Pp. 323-326.

8. BITTERMAN, M. E., KRAUSKOPF, J., &
   HOCHBERG, J. E. Threshold for visual
   form: a diffusion model. Amer. J.
   Psychol., 1954, 67, 205-219.

9. BRUNSWIK, E. Systematic and representative
   design of psychological experiments:
   with results in physical and social
   perception. Berkeley: Univer. of California
   Press, 1947.

10. BRUNSWIK, E. The conceptual framework
    of psychology. Chicago: Univer. of
    Chicago Press, 1952.

11. BRUNSWIK, E., & KAMIYA, J. Ecological
    cue-validity of “proximity” and of
    other Gestalt factors. Amer. J. Psychol.,
    1953, 66, 20-32.

12. DEUTSCH, J. A. A theory of shape recognition.
    Brit. J. Psychol., 1955, 46, 30-3.

13. DINNEEN, G. P. Programming pattern
    recognition. Cambridge: Massachusetts
    Inst. of Technology, Lincoln Lab.,
    1955.

14. FEHRER, E. V. An investigation of the
    learning of visually perceived forms.
    Amer. J. Psychol., 1935, 47, 187-221.

15. FITTS, P. M., WEINSTEIN, M., RAPPAPORT,
    M., ANDERSON, N., & LEONARD,
    A. J. Stimulus correlates of visual pattern
    recognition. J. exp. Psychol., 1956,
    51, 1-11.

16. FRENCH, R. S. The discrimination of dot
    patterns as a function of number and
    average separation of dots. J. exp.
    Psychol., 1953, 46, 1-9.

17. HOCHBERG, J. E., GLEITMAN, H., & MACBRIDE,
    P. D. Visual thresholds as a
    function of simplicity of form. Amer.
    Psychol., 1948, 3, 341-342. (Abstract)

18. HOUSEHOLDER, A. S. Concerning Rashevsky’s
    theory of the “Gestalt.”
    Bull. Math. Biophys., 1939, 1, 63-73.

19. KAUFMANN, E. L., LORD, M. W., REESE,
    T. W., & VOLKMANN, J. The discrimination
    of visual number. Amer. J.
    Psychol., 1949, 62, 498-525.

20. KLEMMER, E. T., & FRICK, F. C. Assimilation
    of information from dot and
    matrix patterns. J. exp. Psychol., 1953,
    45, 15-19.

21. KOVASZNAY, L. S. G., & JOSEPH, H. M.
    Processing of two-dimensional patterns
    by scanning techniques. Science, 1953,
    118, 475-477.

22. KRAUSKOPF, J., DURYEA, R. A., &
    BITTERMAN, M. E. Threshold for visual
    form: further experiments. Amer. J.
    Psychol., 1954, 67, 427-440.

23. LABERGE, D. L., & LAWRENCE, D. H. A
    method of generating visual forms of
    graded similarity. Amer. Psychologist,
    1955, 10, 401. (Abstract)

24. PITTS, W., & MCCULLOCH, W. S. How
    we know universals. The perception of
    auditory and visual forms. Bull. Math.
    Biophys., 1947, 9, 127-147.

25. RASHEVSKY, N. Mathematical biophysics.
    Chicago: Univer. of Chicago Press,
    1948.

26. SELFRIDGE, O. G. Pattern recognition and
    learning. Cambridge: Massachusetts
    Inst. of Technology, Lincoln Lab., 1955.

27. THOMPSON, D. W. Growth and form. New
    York: MacMillan, 1942.


Received March 19, 1956. 





PSYCHOLOGICAL BULLETIN 
Vol. 53, No. 6, 1956 


THE NORMAL CURVE AND THE ATTENUATION
PARADOX IN TEST THEORY 
LLOYD G. HUMPHREYS¹
University of Illinois


The appearance of Loevinger’s 
paper (2) on the attenuation paradox 
in test theory was the precipitating 
factor in the writing of this note.² In
reacting to her development of the 
paradox (supposed lack of monotonic 
relationship between reliability and 
validity) certain biases concerning 
test theory and test statistics which 
the writer has held for several years 
were crystallized. 

Bias Number One. Let's forget our 
fixation on the normal curve in test 
theory. 

Bias Number Two. Let’s use statistics
appropriate for rank-order,
point distributions.

In support of these biases the fol- 
lowing two arguments are offered: 

1. Test score distributions are
rank-order, point distributions. The
underlying trait may or may not be
continuously and normally distributed,
but such speculation is of no
import. Psychological tests furnish
rank-order information only and,
furthermore, we have few prospects
of obtaining devices of any other
type. Criteria, on the other hand,


¹ Visiting professor, University of Illinois,
fall semester, 1955; on leave from Personnel
Research Laboratory, Air Force Personnel and
Training Research Center. This article is 
based in part on work done under ARDC 
Project No. 7702 in support of the research 
and development program of the Air Force 
Personnel and Training Research Center, 
Lackland Air Force Base, Texas. Permission 
is granted for reproduction, translation, pub- 
lication, use, and disposal in whole or in part 
by or for the United States Government. 

² The writer is indebted to Drs. Robert
Travers, John Leiman, and John Schmid, 
Jr., for critical reading of this manuscript. 




may occasionally be continuously 
distributed and certain of these dis- 
tributions may be normal, but cri- 
teria also are more frequently in the 
form of tests, ratings, rankings, pass- 
fail, and other point distributions. 

2. If no assumption is made con- 
cerning the shape of the criterion 
distribution in the work of Loevinger 
(2), Brogden (1), and Tucker (4), 
there is no paradox. For example, if 
all items in a test have difficulty val- 
ues of .5 and if all intercorrelations 
of items are equal, the relationships 
contained in Table 1 between num- 


TABLE 1

VALIDITY AS A MONOTONIC FUNCTION
OF RELIABILITY

Item Validity
Item Inter.
9 45 90
Items Items Items Items

.73     .83     .92     .96
.85     .91     .96     .98
.90     .95     .98     .99
.94     .97     .99     .993
.96     .98     .991    .996
.97     .99     .994    .997
.98     .994    .997    .998
.993    .997    .999    .999


ber of items, item reliability, or level 
of interitem correlations, and validity 
of total scores are obtained. It is 
seen that the relationship between 
reliability and validity is monotonic.

Discussion. In obtaining the above 
results the same assumption about 


item validity made by preceding 
writers was used, i.e., each item except
for errors of measurement is a
true measure of the criterion. This 










means that the validity of an item is 
the square root of its reliability. In 
the present case reliability is indi- 
cated by the phi coefficient between 
items in the test, and the validity is 
a point biserial between the item and 
“true” score. The values in the table 
are those obtained by applying the 
usual formula for the correlation of 
sums. Please note that here and elsewhere,
when the term “correlation”
is used, a product-moment correla- 
tion is assumed. 
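
The computation just described can be sketched as follows. This is a reconstruction under the stated assumptions (n standardized items, every inter-item correlation equal to r, every item validity equal to the square root of r), in which case the correlation of the sum with the criterion reduces to sqrt(n*r / (1 + (n - 1)*r)). The function name and the particular item counts and correlations are illustrative and are not claimed to be those of Table 1.

```python
import math

def total_score_validity(n_items, r):
    """Correlation between an n-item total score and the criterion.

    Assumes standardized items, equal inter-item correlations r,
    and item validity sqrt(r) for every item.
    """
    item_validity = math.sqrt(r)
    covariance_with_criterion = n_items * item_validity
    sd_of_sum = math.sqrt(n_items + n_items * (n_items - 1) * r)
    return covariance_with_criterion / sd_of_sum

item_counts = [10, 20, 50, 100]                    # illustrative test lengths
correlations = [k / 10 for k in range(1, 10)]      # illustrative item reliabilities

grid = [[total_score_validity(n, r) for n in item_counts] for r in correlations]
for r, row in zip(correlations, grid):
    print(f"{r:.1f}  " + "  ".join(f"{v:.3f}" for v in row))
```

Every row increases from left to right and every column increases from top to bottom, so under these assumptions validity rises with both test length and item reliability.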

The reader may have difficulty vis- 
ualizing the shapes of these criterion 
distributions since the definition of 
true is tied to the concept of infinity. 
The amount of error isn’t great in 
any derivation, however, if one substitutes
a large number for infinity.³
One thousand items will give results
reasonably comparable to infinity
(ten thousand would be eminently
safe), and the shape of the distribution
can actually be worked out. Suffice
to say, however, that a criterion
distribution, as defined for Table 1,
will not be normal when all items
have difficulty values of .5 unless
item intercorrelations are zero. For
the same item difficulty specifications
the distribution becomes rectangular
when item correlations reach
4 and becomes increasingly U shaped
as item correlations increase from }
to 1.00.

The importance of the assumption
concerning normality of criterion distribution
is made clear in Table 2.


³ The mathematician, H. T. Davis, made
this suggestion in principle in a class at 
Indiana University in 1935-36. He stated 
that if mathematicians substituted a ‘very 
large number” for infinity in their calculations 
they would not obtain significantly different 
answers and their assumptions would have 
operational meaning. This suggestion seems 
peculiarly appropriate for test theory. For 
the latter theory the number does not need 
to be nearly as large as Dr. Davis envisioned. 


TABLE 2

COMPARISON OF ITEM VALIDITIES WITH AND
WITHOUT THE ASSUMPTION OF CRITERION
DISTRIBUTION NORMALITY

Item Reliabilities     Item Validities            Comparison Values
(1)       (2)          (3)          (4)           (5)
r_tet     Phi          √r_tet,      √phi,         r_pbis computed
                       or r_bis     or r_pbis     from r_bis

                       .316                       .251
                       .447                       .358
                       .548                       .438
                       .632                       .506
                       .707                       .566
                       .775                       .620
                       .837                       .669
                       .894                       .716
                       .949                       .759


This table was constructed by first
assuming certain item reliabilities
stated in terms of the tetrachoric correlation.
These values are in Column
1. Column 2 contains the corresponding
phi coefficients for the same items.
Column 3 contains the item validities,
stated as continuous biserials,
when the criterion is assumed to be
a true, normally distributed measure
of the function measured by the
item. Values in Column 3 are comparable
to item validities used previously
by Brogden and Loevinger.
Column 4 also contains item validities,
stated as point biserials, but the
criterion is assumed to be the sum of
an infinite number of items, of a
given level of reliability, whose distribution
takes the shape dictated
by their intercorrelations. Column
5 also contains point biserials, but
these were computed from the continuous
biserials in Column 3, which
were based on an assumed normal
distribution, by multiplying each by
the expression z/√(pq). Comparison







of Columns 4 and 5 shows how the 
assumption of normality attenuates 
item validities with the error becom- 
ing progressively larger with higher 
validities. 
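
The construction of the columns can be retraced with a short sketch for items of difficulty .5. Two points are assumptions rather than statements from the text: that phi is obtained from the tetrachoric value by the arcsine relation phi = (2/π)·arcsin(r_tet), which holds for a bivariate normal surface cut at both medians, and that z denotes the ordinate of the unit normal curve at the point of dichotomy. The names are illustrative.

```python
import math
from statistics import NormalDist

P = 0.5                                  # item difficulty assumed throughout
Q = 1.0 - P
Z = NormalDist().pdf(NormalDist().inv_cdf(P))   # normal ordinate at the cut

def table_row(r_tet):
    """Return (phi, r_bis, r_pbis, r_pbis_from_bis) for one item reliability."""
    phi = (2.0 / math.pi) * math.asin(r_tet)      # assumed relation, p = .5 only
    r_bis = math.sqrt(r_tet)                      # Column 3
    r_pbis = math.sqrt(phi)                       # Column 4
    r_pbis_from_bis = r_bis * Z / math.sqrt(P * Q)  # Column 5
    return phi, r_bis, r_pbis, r_pbis_from_bis

rows = [table_row(k / 10) for k in range(1, 10)]
gaps = [row[2] - row[3] for row in rows]          # Column 4 minus Column 5
for k, row in enumerate(rows, start=1):
    print(f"{k / 10:.1f}  " + "  ".join(f"{v:.3f}" for v in row))
```

Under these assumptions Column 4 exceeds Column 5 in every row and the gap widens as reliability rises, which is the pattern the comparison of the two columns is meant to show.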

Similar tables can readily be computed
for other levels of item difficulty.
Again, there is no paradox,
but the criterion distributions are
skewed as well as flat when item intercorrelations
are greater than zero.
The assumption of a normal distribution
of the criterion is not compatible
with the mechanics of adding test
items together.


The problem is more complex if 
item difficulties vary. Item relia- 
bility can no longer be estimated from 
the intercorrelations of the items in 
the test, but must be defined as cor- 
relation with a comparable item in 
another test. The comparable item 
must measure the same function and 
must have the same mean and vari- 
ance. Intercorrelations of items hav- 
ing different means and variances, 
but otherwise measuring the same 


function equally reliably, will be 


lower than the products of the square 
roots of their reliabilities. 

With items distributed in difficulty 
there is again no paradox, however, 
since spread of item difficulties will 
also affect the shape of the criterion 


distribution. Variance of item difficulties
forces scores toward the center
of the distribution and thus counters 
the effect of item intercorrelations. 
It is still possible to argue that a 
paradox is involved since classical 
test theory does not allow for the 
flexibility in shape of distribution re- 
quired in order for the classical for- 
mulas to be applicable. The locus 
of the paradox can, however, be more 
precisely stated. In order for the rela- 
tionship between validity and reli- 
ability to hold, one cannot keep con- 
stant both the form of the criterion 




distribution and the distribution of 
difficulties of the test items. 

Conclusion for test construction.
The test technician should proceed
with the job of test construction without
making obeisance to the normal
curve. His decisions should be made
in sequential fashion from most to
least important. The shape of his
test score distribution is a decision
made late in the sequence and his desires
about shape of distribution
should not lead to reversals of earlier,
more important decisions. It should
also be noted that all of his decisions
are made with a particular group of
examinees in mind, since the level
and range of their ability are crucial
factors in the writing and selection of
test items.

The first step in test construction
is to draw up specifications for the
test. Decisions made at this stage
should not be changed, unconsciously,
by later statistical computations
of the sort used in item analysis,
measures of test reliability, and
measures of test homogeneity. Blind
application of statistical procedures
may change the nature of the test.

For example, the test may be designed
to predict a particular complex
criterion. Items will then be included
in numbers such that their
weight in the total score will be optimum
for the purpose. Selection of
items on the basis of correlation of


items against total test score would 
obviously be inappropriate. A Kuder- 
Richardson homogeneity coefficient 
would also be inappropriate for the 
test as a whole. 

One may also be interested in
measuring a psychological “trait.”
In this case the tendency is to think
of the problem in terms of the homogeneity
of the items on the
grounds that a heterogeneous test by
definition cannot measure a unitary










trait. If homogeneity is defined as
level of item intercorrelations, however,
there is again the possibility of
error in the blind following of statistical
indices. Let us suppose that a
mechanical information test is desired.
The following are possible examples
of such tests in descending
order of item intercorrelations (difficulty
level of items being held constant):
(a) Information about the
crosscut saw and its use; (b) Information
about saws and their uses;
(c) Information about woodworking
tools and their uses; (d) Information
about tools and their uses in woodworking,
plumbing, metal working,
automotive repairing, etc.

For many purposes test d may be
most desirable, though its homogeneity
as defined above is lower than for
the other tests in the series. This
means that the test specifications
must indicate how broadly this test
should be defined. Even a fairly
broad test may be relatively homogeneous,
however, in that the items in
the test may still be more like each
other than items in other tests in the
same battery (3).

High item reliability is always desired.
Nothing is gained from low
item reliabilities. The reader must
remember, however, that item reliability
is defined as the correlation
with another comparable item, and
is not estimated from correlations
with all other items in the test. Hence
there is no contradiction between the
present advice to achieve high item
reliability and that given above
which was to select a desired degree
of homogeneity. The test constructor
should, therefore, as his next step,
write the most reliable items he possibly
can for the function he wants to
measure. By necessity, though not
from choice, item reliabilities will
often be quite low because reliable
measurement in many areas is difficult.

One cannot be as dogmatic about 
high test reliability as about high 
item reliability. Test reliability is a 
function of item reliabilities and item 
intercorrelations; i.e., test reliability 
is in part a function of homogeneity. 
High test reliability can be achieved 
by narrowing the focus of the test 
and attaining high homogeneity. 
Care must be exercised in item selec- 
tion, therefore, not to confuse item 
reliability and homogeneity and 
thereby change the function meas- 
ured by the test. The test technician 
must maintain his original specifica- 
tions in spite of temptations to in- 
crease test reliability. 

The next decision concerns the
shape of the distribution of test
scores desired. Depending on the
purpose of the test, the desired distribution
may be normal, platykurtic,
skewed, or U shaped. For a general
purpose test the writer submits that
a rectangular distribution is most
useful since this distribution most
accurately represents the information
furnished by a psychological test.
That is, the rank-ordering of persons
is accomplished equally well in all
parts of the range when the distribution
is rectangular. This means that
reliability of discrimination is maximized
over all.

The desired shape is achieved or,
more commonly, approached by controlling
the difficulty levels of the
test items. Item difficulties alone are
manipulated because previous decisions
have fixed the general level of
item intercorrelations that are possible.
With high item intercorrelations,
a constant level of item difficulty
will produce a U-shaped distribution.
As the variance of item difficulty
increases, the peaks of the U-shaped
distribution will converge to
the center of the distribution.

The reader should be warned that 
some of the shapes of test score dis- 
tributions are highly theoretical in 
terms of the characteristics of items 
available for most measurement pur- 
poses. One practical outcome, how- 
ever, is to question the decision made 
automatically by most test con- 
structors to vary the difficulty levels 
of the items in the test. With low 
item intercorrelations of the sort obtained
in most aptitude tests, only by
careful selection of the most reliable 
items at a constant level of difficulty 
can a rectangular distribution be 
approached. 


SUMMARY 


1. The attenuation paradox in 
test theory is a result of the assump- 
tion made by previous writers of a 
continuous normal distribution of the 
criterion. 


2. There is no paradox if the cri- 
terion distributions can assume any 


shape. If this is considered ipso 
facto paradoxical, then the locus of 
the paradox is in one’s inability to 
hold constant both the shape of the 
criterion distribution and the distri- 
bution of item difficulties. 

3. The pervasive use of the assumption
of continuous normal distributions
in test theory and test
statistics is questioned on grounds
that test data are in the form of rank-order,
point distributions.

4. The test technician should make 
decisions in constructing a test in a 
particular sequence. This sequence 
is as follows: 

a. Outline his test specifications. 
This will specify the desired degree 
of homogeneity (level of item inter- 
correlations) wanted in the test. High 
homogeneity is not necessarily de- 
sirable. 

b. Write the most reliable items 
possible to measure the desired func- 
tion or functions. Items of low relia- 
bility are never desired. 

c. Do not always try to maximize 
test reliability, since the latter is a 
function both of item reliability and 
homogeneity. The desired degree of 
homogeneity should be maintained 
even if item-test correlations are low. 

d. Select the form of the raw score 
distribution of test scores desired. 
This can be any form, though a rec- 
tangular distribution is recommended 
for a general purpose test.

e. Strive to obtain the desired form 
of distribution by varying item difficulties
only. Previous and more important
decisions have fixed the level
of item intercorrelations which is the 
other determiner of shape of distribu- 
tion. 


REFERENCES 


1. BROGDEN, H. E. Variations in test validity
   with variation in the distribution of item
   difficulties, number of items, and degree
   of their intercorrelation. Psychometrika,
   1946, 11, 197-214.

2. LOEVINGER, JANE. The attenuation paradox
   in test theory. Psychol. Bull., 1954,
   51, 493-504.

3. LOEVINGER, JANE, GLESER, GOLDINE C.,
   & DUBOIS, P. H. Maximizing the discriminating
   power of a multiple score
   test. Psychometrika, 1953, 18, 309-317.

4. TUCKER, L. R. Maximum validity of a
   test with equivalent items. Psychometrika,
   1946, 11, 1-13.


Received December 6, 1955. 








PSYCHOLOGICAL BULLETIN 
Vol. 53, No. 6, 1956 


THE ABILITY OF HUMAN OPERATORS TO DETECT 
ACCELERATION OF TARGET MOTION¹


ROBERT M. GOTTSDANKER 
Santa Barbara College, University of California 


In the tracking task, or indeed in 
any task requiring adjustment to 
moving objects, the operator is often 
confronted with targets which, while 
preserving their general directions, 
change their velocities. It is the ob- 
ject of this survey to describe the ex- 
perimental literature which deals 
with responses that are made to such 
accelerated motion. Of particular 
interest in relation to tracking be- 
havior is the extent of acceleration 
which must occur in order for it to 
be noticed. The tracker’s ability to 
match target velocities with his own 
movements must depend in part on 
his sensitivity to change in velocity. 
Other information, though less obviously
applicable, may help in completing
the picture of how the operator
responds to changing velocities.

All of the findings on the topic of 
response to target acceleration that 
the writer has been able to unearth 
are included in the present review. 
Actually, very few studies have dealt 
with this problem. Further, it was 
not always the primary focus of the 
investigation. For this reason, in dis- 
cussing a study scant consideration 
may be given to its major objective. 
Instead, the aspects which bear on 
the present subject matter will be 
emphasized. 


¹ This research was supported by the United
States Air Force under Contract No. 33(616)- 
2024 with the University of California, moni- 
tored by the Aero Medical Laboratory of 
Wright Air Development Center. Permission 
is granted for reproduction, publication, use 
and disposal, in whole or in part by or for the 
United States Government. 


The making of a systematic evalu- 
ation of the present status and future 
possibilities of work on response to 
acceleration necessitates the locating 
of this problem within the more gen- 
eral framework of response to target 
motion. To do this it will be necessary
to describe some aspects of studies
on constant-velocity motion. However,
there is no intention of making
the present survey broader than is
shown by the title. Consequently
such important topics as perceived 
motion from discrete stimuli, induced 
motion, one’s interpretation of his 
own relative motion, etc., will receive 
no mention. A fairly strict limitation 
of subject matter is mandatory be- 
cause the problems of perception and 
action in relation to motion include 
all of the variables of the stationary 
environment in addition to those in- 
troduced by motion. 


DESCRIPTION OF RELEVANT STUDIES 

The complication experiment. The
classical complication experiment,
with its ancestry in the personal
equation of Bessel (2, p. 133), is the
first of the situations in which the
effect of acceleration of motion on
judgment was studied. Wundt (21),
using a complication pendulum, attempted
to judge the position of the
pointer at the sound of a bell stroke.
In Wundt’s arrangement, the pointer
oscillated symmetrically about the
straight-up position. Figure 1 shows
a top-pointing pendulum at two positions
of its motion in the upper
sketches. It was found by Wundt
that during the positively accelerated
upward phase, errors of judgment 
were negative, i.e., the judged posi- 
tion was an earlier one than the point 
of actual coincidence. The black 
circle in the left-hand sketch illus- 
trates such a judgment. During the 
subsequent negatively accelerated 
downward phase, errors tended to be 
positive. An example is shown in the 
sketch on the right. In addition to 
these results, Wundt found negative 
errors to predominate for slow mo- 
tions and positive errors for rapid 
motions. Intermediate speeds were 
found which had no over-all constant 
error. These last findings were in 
agreement with the results obtained 
when he used a complication clock, 
which moves with constant angular 
velocity. 

Subsequent investigations by von 
Tschisch (20) and by Pflaum (18), 
also using the complication pendu- 
lum, support Wundt’s findings in the 
major respects. Von Tschisch utilized 
sense modes in addition to sound, 
e.g., touch, for his instantaneous 
stimulus and also used combinations 
of stimuli. Geiger’s studies (6) added 
several controls to the earlier work. 
Most important was to use Ss other 
than the experimenter (seven in 
number) and to vary the orientation 
of the pointer and semicircular scale 
on different groups of trials by optical 
methods. Using a constant-velocity 
complication clock, he found nega- 
tive errors to be typical for upward 
movement and positive errors for 
downward movement, regardless of 
whether these phases occurred before 
or after the midpoint of the motion, 
or whether the pointer moved clock- 
wise or counterclockwise. Doubt was 
thus cast upon the importance of ac- 
celeration in producing negative or 
positive errors. In attempting to resolve
this question, Klemm (16) employed
both a top-pointing and a
bottom-pointing pendulum. He found
that differences in the two phases of 
motion were more marked for the 
top-pointing pendulum. From this 
he concluded that both the sign of 
acceleration and the direction of 
movement were operating factors. 
In the top-pointing pendulum the 
factors work in the same direction 
and thus reinforce one another, but 
in the bottom-pointing pendulum 
they work in opposite directions and 
are in conflict. Reference to Fig. 1
shows that in the top-pointing pendulum
the two conditions which have
been described as making for negative
error, upward motion and positive
acceleration, occur together.
Similarly, downward motion and
negative acceleration, both of which
are described as making for positive
errors, also coincide. For the bottom-pointing
pendulum, the opposing
conditions coincide: downward motion
and positive acceleration; upward
motion and negative acceleration.

FIG. 1. RELATIONSHIPS IN EXPERIMENTS
ON THE COMPLICATION PENDULUM
[Figure not reproduced. Sketches of the top-pointing and the bottom-pointing pendulum, each labeled with its upward and downward phase and the sign of acceleration in that phase.]

Burrow (5), in analyzing previous 
work on the complication experiment 
in conjunction with his own complica- 
tion-clock experiments, concluded 
that the effect of acceleration had 
never been demonstrated. It was his 
contention that the direction of error 
on the complication pendulum could 
be explained by the attraction of the 
midpoint of the scale, which becomes 
a kind of goal. However, such a 
central tendency should bring about 
a positive error during the initial 
phase and a negative error during the 
final phase. The results obtained by 
Wundt and his successors were pre- 
cisely the opposite. 

Prediction-motion. Continuative 
responses have been studied in a series
of experiments by the writer (8,
9, 10). The S tracked a small moving
target in a paper-and-pencil
tracking box. On some trials the target
had a constant velocity of motion
but on others the motion was either 
positively or negatively accelerated. 
The special task given to S was to 
continue his tracking responses as if 
he were tracking an airplane which 
had momentarily disappeared behind 
a cloud. The general outcome, as far 
as the accelerated paths of motion 
are concerned, is that S’s continua- 
tions were made at a constant veloc- 
ity rather than at an accelerated one; 
further, this velocity did not match 
the terminal velocity of the visible 
target. For positively accelerated
motions, the continuation velocity
was lower than the terminal velocity,
and on negatively accelerated motions
it was higher. This was interpreted
as reflecting an averaging or
integration of preceding velocities.

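
The direction of this discrepancy follows from simple averaging, as the sketch below illustrates. It is an illustration only, not the procedure used in the studies cited: the function name, the length of the averaging window, and the velocities and accelerations are hypothetical values, and the acceleration is assumed constant over the window.

```python
def mean_velocity_over_window(v_end, accel, window):
    """Mean velocity over the last `window` seconds of a motion with constant
    acceleration `accel` that ends at velocity `v_end`.

    Velocity is linear in time under constant acceleration, so the mean over
    the window equals the velocity at the middle of the window.
    """
    return v_end - accel * window / 2.0


# Hypothetical values: terminal velocity 4.0 deg/sec, 2-sec averaging window.
v_end = 4.0
for accel in (+0.5, -0.5):  # deg/sec^2, illustrative only
    v_cont = mean_velocity_over_window(v_end, accel, window=2.0)
    relation = "lower" if v_cont < v_end else "higher"
    print(f"accel={accel:+.1f}: averaged velocity {v_cont:.2f} deg/sec "
          f"is {relation} than the terminal velocity {v_end:.2f} deg/sec")
```

An averaged velocity falls below the terminal velocity when the motion is positively accelerated and above it when the motion is negatively accelerated, whatever window length is assumed.
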
Threshold for sudden changes in 
velocity. The detection of sudden 
changes in target rate has been stud- 
ied by Hick (12). In one procedure, 
called the “drum method,” target
motion was generated by passing an 
endless belt on which were printed 
sloping lines under a mask which had 
an open slot perpendicular to the 
path of motion. The target moved 
at a constant velocity in the slot for 
between two and four seconds, at 
which time the velocity changed 
instantaneously. The S was to indi- 
cate each time he saw a change and 
to indicate whether it was an increase 
or a decrease. The relative mean 
threshold of change (0.5 probability) 
was 12 per cent of the initial velocity 
when the angular velocity of motion 
was 4.18°/sec. but increased as lower 
target velocities were employed, the 
corresponding values being 41 per
cent for the 0.38°/sec. velocity and 
133 per cent for the 0.11°/sec. veloc- 
ity. It was Hick’s feeling that the 
12 per cent value at the most rapid 
motion represented a true threshold 
but that the high values for the 
slower motions were artifacts. Dur- 
ing a crossing with slow motion there 
were often a number of changes of 
velocity. The fact that S's errors on 
slow crossings were usually failures 
to respond rather than the making 
of incorrect responses is indicative 
of inattentiveness. 
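
The relative thresholds quoted above for the drum method can be restated as absolute changes in velocity. The short sketch below does only that arithmetic on the three pairs of figures given in the text; the variable and function names are hypothetical, and the conversion is offered as a reading aid rather than as part of Hick's analysis.

```python
# (initial velocity in deg/sec, relative threshold in per cent), as quoted
# in the text for the drum method.
drum_results = [(4.18, 12.0), (0.38, 41.0), (0.11, 133.0)]


def absolute_threshold(velocity, relative_percent):
    """Change in velocity (deg/sec) that corresponds to a relative threshold."""
    return velocity * relative_percent / 100.0


for velocity, percent in drum_results:
    dv = absolute_threshold(velocity, percent)
    print(f"{velocity:5.2f} deg/sec, {percent:5.1f} per cent "
          f"-> change of {dv:.3f} deg/sec")
```

Stated this way, the two slower motions require nearly the same absolute change (roughly 0.15°/sec), although their relative thresholds differ by a factor of about three.
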

In the other procedure, called the 
“oscilloscope method,” targets were
generated on the face of the tube. 
Here S was shown the position where 
the change would take place. He also 
could hear the click of a relay when 
the change did occur. Again, S’s 
task was to indicate whether the 
change was an increase or a decrease 
in velocity. For increases in velocity, 
the relative threshold values ranged 
from slightly below 10 per cent to 
slightly above 13 per cent for veloci- 
ties lying between 0.15 and 10.25°/ 
sec. For decreases in velocity, thresh- 
old values over the same zone of ve- 
locities ranged from above 7 per cent
to about 21 per cent. In some cases, 
changes of as little as 2.5 per cent 
were detected significantly better 
than chance. Reducing the view- 
ing period to as short a time as 0.5 
sec. did not reduce the accuracy of 
judgments. However, presenting 
two targets which crossed before each 
changed velocity in identical fashion 
did elevate the threshold somewhat. 

Phenomenological description of har- 
monic motion. Two investigators 
have obtained phenomenological re- 
ports based on the presentation of 
harmonic motion. Metzger (17) used 
a technique in which a fixed light 
source threw shadows of moving ob- 
jects upon a translucent screen, on 
the other side of which was the S. 
One or more vertical rods mounted on 
a horizontal turntable provided tar- 
gets, each rod giving rise to a shadow 
which moved from side to side in a 
sinusoidal fashion. Metzger found a 
great preference for continuous paths 
of motion. For example, after two 
shadows joined together, the two 
“new” objects seen as arising were 
inevitably those which continued the 
previous directions and velocities
rather than those which reversed or 
modified the directions and velocities. 

More pertinent for the present in- 
terest are the recent investigations by 
Johansson (14). This investigator 
also used a shadow technique in con- 
junction with a translucent screen. 
However, he pasted small target 
objects upon a transparent plate 
which was located between the light 
source and the translucent screen. 
The plate was moved horizontally in 
a harmonic manner by means of an 
eccentric drive. The importance of 
this study and of Johansson’s subse- 
quent major study (15) concerns the 
problem of perceptual organization 
of all major elements in a field, rather 
than observations on acceleration.
Nevertheless, there is a clear state- 
ment regarding the perception of ac- 
celerated motion (14, p. 32). 


When an O is shown this kind of motion 
passing through a homogeneous field, and is 
asked how the velocity of the object behaves 
if different parts of the path of motion are 
compared one with another, or in other words, 
if the velocity changes along the path, practi- 
cally without exception the same answer is 
received: the point moves slowly just at the 
turning point; but otherwise its velocity is 
constant. 


DISCUSSION 


General observations on studies re- 
ported. It is evident from the forego- 
ing survey that the information on re- 
sponse to acceleration of target mo- 
tion is meager. Nor are the small 
caches of knowledge strategically 
placed for either theoretical or practi- 
cal purposes. Above all, little as yet 
can be stated in quantitative terms. 

The statement by Johansson may 
be broadened to include motions 
other than harmonic in the following 
way: When target velocity changes 
gradually, a person can tolerate a 
good deal of such change without re- 
alizing that the speed is not constant. 
This generalization may prove to be 
of value in the development of a 
coherent point of view. A closely re- 
lated but not identical suggestion is 
that the operator's perceptual mech- 
anism integrates smoothly changing 
velocities over a considerable period 
of time. Evidence for this conclusion 
was obtained by the writer in his stud- 
ies of prediction-motion. It is also the 
view of the writer that the early work
on the complication pendulum may
be explained partially in terms consonant
with the foregoing formulations.
First, S must require some time
after the instantaneous signal to become
aware of it; an appreciable reaction
time is one of the most predictable
aspects of behavior. Next, it is
clear that S allows for his reaction time
in making his judgment of sound-pen- 
dulum coincidence, otherwise his 
error would always be positive. This 
is not the case: during the positively 
accelerated phase of harmonic mo- 
tion, S judges the pendulum position 
to be an earlier one than is actually 
correct. It is hypothesized that S 
attempts to extrapolate backward to 
the extent of one reaction-time inter- 
val. If he uses the velocity existing 
during a brief period after the in- 
stantaneous signal for the operation, 
apparently being content to disre- 
gard the fact that the velocity is 
changing, the obtained results would 
be expected. In the phase of positive 
acceleration he appears to base his 
extrapolation upon velocities that 
have become too high for correct 
localization and in the phase of negative
acceleration upon velocities that
have become too low. It should be
pointed out that the same qualitative
predictions would be made by assuming
that S computes rates by instantaneous
differentiation rather than
by integrating over a period of time.
The quantitative data do not allow 
selecting between the alternatives. 
In any event, S acts very much as 
though he were unaware that the 
velocity is changing. 
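
The backward-extrapolation hypothesis can be checked with a small calculation. The sketch below is an idealization under stated assumptions, not a model fitted to any of the data reviewed: acceleration is taken as constant over the reaction-time interval, S is assumed to use the velocity present at the moment he becomes aware of the signal, and the function name and numerical values are hypothetical.

```python
def judged_position_error(v0, accel, reaction_time):
    """Judged minus true pointer position, measured in the direction of motion.

    The true position at the instant of the signal is taken as 0 and the
    velocity at that instant as v0. S becomes aware of the signal one reaction
    time later and extrapolates backward over that interval at the velocity
    he then sees, ignoring that the velocity has been changing.
    """
    r = reaction_time
    position_at_awareness = v0 * r + 0.5 * accel * r ** 2
    velocity_at_awareness = v0 + accel * r
    judged = position_at_awareness - velocity_at_awareness * r
    return judged  # the true position is 0, so this is the error


# Hypothetical values: 10 deg/sec at the signal, 0.2-sec reaction time.
for accel in (+4.0, -4.0):  # deg/sec^2
    err = judged_position_error(v0=10.0, accel=accel, reaction_time=0.2)
    print(f"accel={accel:+.1f}: error = {err:+.3f} deg")
```

Under these assumptions the error reduces to minus one half the acceleration times the square of the reaction time, so it is negative during positive acceleration and positive during negative acceleration, the directions reported for the top-pointing pendulum.
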

The study by Hick, on the other 
hand, shows S to be an extremely ac- 
curate discriminator of velocities, one 
who can sometimes detect an increase 
of 2.5 per cent and who needs no more 
than 0.25 sec. either before or after 
the change to make the discrimina- 
tion. How may this description be 
reconciled with that of the uncritical 
S who can tell that harmonic motion 
is changing in velocity only at the 
periods near reversal of direction? 
The obvious difference between the 
stimulus conditions for the disparate 
observations is in the gradualness of
transition from one velocity to an- 
other. During a course of harmonic 
motion, acceleration is least just at 
the times that velocities are greatest, 
in the region of midcourse. Because 
there is so little relative acceleration 
in this region, it would be expected 
that it would go unnoticed. The re- 
verse holds true near the ends of the 
motion where the ratio of accelera- 
tion to velocity becomes high and 
finally approaches a value which is 
infinitely high. At the time of an in- 
stantaneous change of velocity, as 
introduced by Hick, acceleration is 
naturally infinitely high. 
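
The course of the ratio of acceleration to velocity during harmonic motion can be made explicit. The sketch below assumes the displacement is s(t) = A sin(ωt), with t = 0 at midcourse; the amplitude then cancels and the ratio is ω|tan(ωt)|. The period chosen and the function name are hypothetical.

```python
import math


def accel_to_velocity_ratio(omega, t):
    """|acceleration| / |velocity| for s(t) = A * sin(omega * t).

    velocity     =  A * omega    * cos(omega * t)
    acceleration = -A * omega**2 * sin(omega * t)
    so the amplitude A cancels and the ratio is omega * |tan(omega * t)|.
    """
    return omega * abs(math.tan(omega * t))


omega = 2.0 * math.pi / 4.0              # hypothetical 4-sec period
quarter_period = (math.pi / 2.0) / omega  # time from midcourse to reversal
for fraction in (0.0, 0.25, 0.5, 0.75, 0.95, 0.99):
    t = fraction * quarter_period
    ratio = accel_to_velocity_ratio(omega, t)
    print(f"{fraction:4.2f} of the way to reversal: ratio = {ratio:8.2f} per sec")
```

The ratio is zero at midcourse and grows without limit as the turning point is approached, which is the pattern described in the paragraph above.
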

Analysis of thresholds already de- 
termined. It may be profitable to 
consider the response to target ac- 
celeration in the general context of 
research on the perception of motion. 
This approach should be particularly 
useful in clarifying the language and
problems in the determination of
thresholds. A systematic analysis of 
thresholds both obtained and obtain- 
able might accomplish several things. 
First, the very organization of the 
material should indicate the voids as 
well as the islands in our present 
knowledge. Second, it should de- 
lineate the operations necessary for 
the obtaining of thresholds, and the 
primary variables whose values must 
be specified. Third, it should reveal 
parallels among thresholds which 
have been studied and suggest extra- 
polations to kinds of motion beyond 
the scope of the original studies. A 
graphic technique and a system of 
notation were devised for conducting 
this analysis. 

The significant kinds of threshold 
relating to motion which are de- 
scribed in the literature are repre- 
sented in Fig. 2. The first has been 
called the threshold of motion (shown 
in 2A). Angular distance as a func- 
tion of time is represented in the
graph on the left. On the right, the 
function is that of angular velocity 
against time. Gordon (7), in a recent 
experimental study, defines the threshold
of motion as the lowest detectable
angular velocity (distinguishing it 
from the threshold of displacement, 
which is the smallest angular distance 
over which motion of a given rate 
may be detected). In place of motion,
which is a general word, the present
writer would substitute velocity, the
unit in which the threshold is measured.
Further, since this is an absolute
threshold, it should be called absolute
threshold of velocity. Represented
in both graphs of Fig. 2A are three
test motions: f, g, h. Motion f is shown
as being more rapid than is necessary
for detection, and motion h is too
slow to be detected. Motion g (the
dark line) is that whose velocity just
permits detection. In the graph on
the left, the threshold is the slope of
the g line, or $ds_g/dt$; in the graph on
the right, the threshold is shown by
the height of the g line, or $v_g$. The
motions shown have been equated in
time but could as logically have been
equated in distance. Also, the particular
time used is an arbitrary decision
of the experimenter. Consequently,
the value of either the fixed
time or distance employed must be
specified. Troland reports that the
minimum values generally found for
this threshold lie between 1’ and
2’/sec. (19, p. 380).

FIG. 2. GRAPHIC REPRESENTATION OF
MOTION THRESHOLDS WHICH HAVE BEEN
DETERMINED
[Figure not reproduced. Panels: A. Threshold of motion; B. Difference threshold; C. Threshold of instantaneous change of velocity. In each panel angular distance (left) and angular velocity (right) are plotted against time (t) for the test motions.]

Whereas the foregoing threshold
is an absolute threshold, indicative of
S’s accuracy in distinguishing between
motion and no motion, that
represented in Fig. 2B is a difference
threshold. It is a measure of how
well S distinguishes between two motions
of different velocities. This
problem has not been studied in as
fine detail as the absolute threshold.
In the course of a series of investigations
by J. F. Brown on the perception
of motion, the study by Brown
and Mize (4) furnishes an expression
of the accuracy of Ss in equating the
velocities of sets of two moving
squares on endless belts. The proportional
difference required for discrimination
(Weber fraction) given
by the writers is 0.024. However, it
should be noted that this value actually
refers to the constant error (or
bias) obtained with the method of
limits and so does not carry the implied
meaning. If the standard and
test motions are shown under the
same conditions, there should be no
over-all bias whatsoever, but this
obviously does not mean that the
matching is perfectly accurate.

Another variety of study bearing a
relation to difference threshold is
that of the motion parallax cue of
distance discrimination. If two ob- 
jects move at the same linear veloc- 
ity, it is possible to tell which is 
nearer because it will have a higher 
angular velocity. A study by Graham 
et al. (11) shows that for two vertical 
needles moving horizontally at right 
angles to the line of regard, a differ- 
ence in distance of the needles from 
the S which gives rise to a differential 
angular velocity of about 30”/sec. 
will provide a threshold distance dis- 
crimination. 
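
To give a sense of scale, a differential angular velocity of 30”/sec. can be translated into a difference in distance for a particular arrangement. The sketch below is illustrative only: the linear velocity and viewing distance are hypothetical values, not those of the apparatus in the study cited, and the angular velocity is approximated as linear velocity divided by distance, which holds where the path crosses the line of regard at right angles.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def angular_velocity_arcsec(linear_velocity, distance):
    """Angular velocity (seconds of arc per sec) of an object crossing the
    line of regard at right angles, approximated as v / d."""
    return linear_velocity / distance * ARCSEC_PER_RADIAN


def depth_difference_at_threshold(linear_velocity, distance, threshold_arcsec):
    """Extra distance at which a second object, moving at the same linear
    velocity, is slower in angular velocity by `threshold_arcsec`."""
    omega_near = linear_velocity / distance                      # rad/sec
    omega_far = omega_near - threshold_arcsec / ARCSEC_PER_RADIAN
    if omega_far <= 0:
        raise ValueError("threshold exceeds the angular velocity itself")
    return linear_velocity / omega_far - distance


# Hypothetical arrangement: needles moving at 5 cm/sec, nearer one at 50 cm.
v, d = 5.0, 50.0
delta = depth_difference_at_threshold(v, d, threshold_arcsec=30.0)
print(f"depth difference at threshold: {delta:.4f} cm")
```

The same two functions can be applied to other assumed velocities and distances; the depth difference corresponding to a fixed angular threshold grows rapidly as the viewing distance increases.
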

The graphs for difference threshold
of velocity are shown in Fig. 2B. The
axes carry the same meaning as in A.
The dotted line, j, represents the
standard motion; f, g, and h represent
three test motions which are
respectively more divergent from the
standard than is necessary for discrimination,
just detectably different
(at the threshold), and too much like
the standard to be distinguished from
it. On the left, the threshold is the
difference in slopes between test line
g and standard line j, or $ds_g/dt - ds_j/dt$.
On the right the threshold is
shown by the difference in height between
lines g and j, or $v_g - v_j$. In addition
to the specification of time of
presentation, it is evident that as the
standard velocity is arbitrary, it too
must be specified. As the graphs are
merely illustrative, no attempt has
been made in Fig. 2 (or Fig. 3) to
maintain equivalent scales on the
left and right sides.

A somewhat related experiment 
should be mentioned. This is the field 
study by Biel and G. E. Brown (1), in 
which Ss were asked to estimate the 
linear velocities of various airplanes 
during their courses of motion. Low 
velocities were overestimated and 
high ones were underestimated. The 
S's knowledge of the performance 
characteristics of the several types of 
aircraft used had considerable influence
on his judgments. Such a study
could be represented in Fig. 2B by 
showing the actual rate of the plane 
as the dotted line, S’s mean judg- 
ment as a solid line, and his variabil- 
ity as a zone about his mean judg- 
ment. It also would be necessary to 
have the y axis represent linear in- 
stead of angular distance. 

As may be seen in Fig. 2C, the Hick 
experiment on the detection of in- 
stantaneous change of velocity may 
be represented formally in much the 
same manner as experiments on the 
difference threshold (Fig. 2B). The 
standard motion is now shown to pre- 
cede the test motions. Of course, on 
any one trial only one of the alterna- 
tives follows the standard. As the 
times of presentation of the standard 
and test motions may be varied inde- 
pendently of one another, both must 
be specified in addition to the velocity
of the standard motion.

Thresholds of acceleration. As was 
pointed out in the previous discus- 
sion, the discrepancy between Hick’s 
results and those of investigators us- 
ing harmonic motion could be at- 
tributed to difference in the response 
to smooth change in velocity and to 
discontinuous change. As far as 
thresholds of acceleration are con- 
cerned, further work with harmonic 
motion would appear to be of limited 
value as this motion is a single com- 
plex case in which the extent of ac- 
celeration runs the gamut during each 
cycle. Also, all higher orders of deriv- 
atives are present as well as accelera- 
tion. The detection of instantaneous 
change, although dealing with simple 
linear velocities (except at the point 
of change) is also a special case in 
which acceleration takes on an in- 
finite value. The general case for 
study would be that in which there 
was a constant amount of accelera- 
tion during a motion. Test motions
with different amounts of accelera- 
tion could then be compared. This 
would parallel the work on the abso- 
lute and difference thresholds of 
velocity. The kind of motion to be
studied is necessarily that which is
represented by the equation $s = nt^2 + m$,
the equation which provides
for a constant amount of acceleration.
(Here s is usually measured in angular
units and t in seconds of time.)

No description has been reported 
of an experimental determination of 
threshold of acceleration, although 
Hick and Bates (13) report an im- 
pression gained from preliminary in- 
vestigation that rate must double 
every five seconds for acceleration to 
be noticed. Several different experi- 
mental procedures suggest themselves 
for obtaining absolute threshold of 
acceleration. Some of these may be 
mentioned. First, there could simply 
be judgment by S for each of the 
test motions as to whether the mo- 
tion was accelerated. Second, stand- 
ard constant-velocity motion could 
be presented in paired trials with the 
various test motions, S’s task being 
to decide which member of the pair 
was accelerated. Third, a technique 
analogous to that of Hick’s could be 
used in which S would be required to 
judge whether a test motion was pos- 
itively or negatively accelerated. 
Whatever technique is adopted, sev- 
eral test motions will be used which 
differ in amount of acceleration. 
They can be equated in both time 
and distance, and hence in mean 
velocity. 
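
The equating of test motions in both time and distance can be sketched numerically. The example below is an illustration under stated assumptions: each motion has a constant acceleration and an initial velocity chosen so that a fixed distance is covered in a fixed time. The symbols v0, D, and T, the function names, and all numerical values are hypothetical and do not come from the text.

```python
def initial_velocity(distance, duration, accel):
    """Starting velocity with which a constantly accelerated motion covers
    `distance` in `duration`, from distance = v0*T + accel*T**2 / 2."""
    return distance / duration - accel * duration / 2.0


def velocity_at(t, distance, duration, accel):
    """Velocity at time t of the motion defined above."""
    return initial_velocity(distance, duration, accel) + accel * t


# Hypothetical presentation: 20 deg covered in 4 sec (mean velocity 5 deg/sec).
D, T = 20.0, 4.0
for accel in (0.0, 0.5, 1.0, 2.0):  # deg/sec^2, illustrative only
    v_start = velocity_at(0.0, D, T, accel)
    v_mid = velocity_at(T / 2.0, D, T, accel)
    v_end = velocity_at(T, D, T, accel)
    print(f"accel={accel:3.1f}: start {v_start:4.1f}, "
          f"midpoint {v_mid:4.1f}, end {v_end:4.1f} deg/sec")
```

Whatever the acceleration, the velocity at the temporal midpoint equals the mean velocity D/T, so the velocity-time lines of all motions equated in this way pass through a single point.
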

In Fig. 3A the graphs represent a 
determination of the absolute thresh- 
old of acceleration, where each test 
motion is compared with a constant- 
velocity standard, which is shown by 
the dotted line. As in Fig. 2, the 
three solid lines, f, g, and h represent
motions which are above threshold,
just at the threshold, and below the 
threshold. In the graph on the left, 
the threshold is represented by the 
second derivative of the function of
the line g, $d^2s_g/dt^2$. In the graph on
the right the threshold is the slope
of line g, $dv_g/dt$.

It may be noted that a manipula- 
tion was possible in this experiment 
which was not possible in the de- 
termination of the thresholds of 
velocity: the motions differ only in 
one respect, acceleration, but are the 
same in time and distance. When 
experimenting upon the velocity 
thresholds, the test velocities are 
naturally different. But also if the 
motions are equated in time they 
must differ in distance and vice versa. 
Comparison may be made between 
the right-hand side of Fig. 3A and 
the left-hand side of Fig. 2A. In 
both, the equations are seen to be 
linear, and the statements of threshold
are parallel, $dv_g/dt$ as compared
with $ds_g/dt$. The intersection of all
the lines at the same point in Fig. 3A
shows how it was possible to equate
velocity. A similar arrangement in
the case of Fig. 2A would simply
mean that the motions would center
about the same position, a consideration
which is irrelevant in the determination
of thresholds. As far as
required specifications for the absolute
threshold of acceleration are
concerned, since acceleration may be
varied independently of both time
and distance, the particular time-distance
combination used must be
specified rather than only one or the
other as in the case of the absolute
threshold of velocity.

FIG. 3. GRAPHIC REPRESENTATION OF
THRESHOLDS OF ACCELERATION
[Figure not reproduced. Panels: A. Absolute Threshold of Acceleration; B. Difference Threshold of Acceleration. Each panel is plotted against time (t); in B the dotted line is the standard accelerated motion.]

Also as yet undetermined is the difference
threshold of acceleration.
What would be desired is a measure
of the accuracy with which S is able
to distinguish between one extent of
acceleration and another. In Fig. 3B
the operations involved in determining
such a threshold are shown. A
motion with a standard acceleration
is shown by the dotted line and the
usual three test motions by the solid
lines. As in the case of absolute
threshold, it is possible to equate the
motions in both time and distance.
In the graph on the left, the threshold
is equal to the difference between the
second derivatives of the functions
represented by lines g and j:
$d^2s_g/dt^2 - d^2s_j/dt^2$. In the graph on the right
the threshold is equal to the difference
in slopes of the g and j lines:
$dv_g/dt - dv_j/dt$. The same parallels
and differences exist between the 
right-hand side of Fig. 3B and the 
left-hand side of Fig. 2B as in the 
comparison made for Fig. 3A and 
Fig. 2A. As in the case of absolute 
threshold of acceleration, the particu- 
lar time-distance combinations used 
must be specified. In addition, the 
value of the arbitrary standard ac- 
celeration employed must be stated. 

The question may have occurred 
to the reader of whether it is really
worth while to determine a difference
threshold of acceleration. After all, 
there is no end to the order of deriva- 
tives of motion. Certainly at some 
point there must be an end to the 
utility of determinations of absolute 
and relative thresholds. Perhaps the 
real question concerns the kinds of 
discriminations the operator can 
make. Evidently, if values of the 
third derivative of distance are suffi- 
ciently high, it too may be detected; 
the intuitive term ‘“‘jerk’’ has been 
applied to this characteristic by 
mechanical engineers. Perhaps it is 
beyond this point that the human 
operator has insignificant ability to 
discriminate. 

The constancy problem. One very 
important consideration has been 
slighted in all of the preceding discus- 
sion. It is that thresholds have been 
described in angular terms whereas 
an approximately linear path of mo- 
tion is probably more typical than a 
circular one. In the case of the abso- 
lute threshold of velocity the value 
for any given linear situation may be 
rather accurately specified in angular 
terms. This is because the arc which 
would be subtended is so small (often 
less than one second) that the angular 
rate is essentially constant through- 
out. Circular motions have been used 
predominantly in this work. In the
studies of difference threshold of 
velocity, the paths are necessarily 
longer. Linear paths have been used 
in most of the studies. Obviously the 
angular velocity must vary from 
point to point. Yet the statements 
of threshold are usually given in an- 
gular terms. The reason for this is 
clear. It is thus in order that the 
threshold may be stated independ- 
ently of S's distance from the moving 
object. In the same way, it may 
prove to be of more importance or 
interest to determine thresholds of
acceleration for targets moving in 
linear rather than circular paths. The 
same solution would appear to be 
necessary; a threshold would be 
stated in average angular value for 
the course of motion. 
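
The variation of angular velocity along a linear path, and the average angular value proposed for stating a threshold, can both be computed directly. The sketch below assumes a target moving at constant linear velocity along a straight path that passes the observer at a known perpendicular distance, with the path centered on the point of closest approach; the function names and the numerical values are hypothetical.

```python
import math


def angular_velocity(x, d, v):
    """Instantaneous angular velocity (rad/sec) of a target moving at linear
    velocity v along a straight path, viewed from perpendicular distance d.
    x is the distance along the path from the point of closest approach.
    Obtained by differentiating theta = atan(x / d) with respect to time."""
    return v * d / (d * d + x * x)


def average_angular_velocity(path_length, d, v):
    """Total angle swept divided by the duration of the motion, for a path
    centered on the point of closest approach."""
    swept = 2.0 * math.atan(path_length / (2.0 * d))
    return swept / (path_length / v)


# Hypothetical case: path of 200 units at a distance of 100 units, 10 units/sec.
d, v, path = 100.0, 10.0, 200.0
at_center = math.degrees(angular_velocity(0.0, d, v))
at_end = math.degrees(angular_velocity(path / 2.0, d, v))
mean = math.degrees(average_angular_velocity(path, d, v))
print(f"center {at_center:.2f}, end {at_end:.2f}, average {mean:.2f} deg/sec")
```

In this assumed case the angular velocity at the ends of the path is half that at the center, although the linear velocity never changes; the average lies between the two extremes.
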

The very fact that constant-veloc- 
ity linear motion is seen as such de- 
serves comment. It is a constancy 
phenomenon in the same sense as size 
constancy; equal linear extents are 
judged as equal when at different 
distances and thus differing in angu- 
lar extent. In the present case, the 
extent judged equal is linear velocity 
rather than linear distance. This 
point is not the same as that made 
by Johansson (14, p. 255), who refers
to the previously mentioned percep- 
tion of harmonic motion in which the 
velocity is taken to be unchanging 
for the greater extent of the motion 
as an example of constancy. There 
is no equating of equal linear extents 
but rather an inability to discriminate
among different velocities
whether considered as linear or as
angular.

Constancy of motion was one of the problems investigated by J. F.
Brown (3). His method was to have an S match velocities of moving
objects which were at different distances, a procedure which
corresponds exactly to the typical experiment on size constancy. It
should be remarked that within each of the motions in this experiment
there must necessarily be the single-object constancy already
identified; each object, although taking on a range of values of
angular velocity, appears to move at a constant linear velocity.

A parallel may be found in judg- 
ments of static magnitudes. Shape 
constancy can also be looked upon as 
a type of single-object constancy. 
There is a constancy situation even when a large square is put at some
distance from the observer and directly before him in the frontal
plane. The angular distances are necessarily less at the two sides and
at the top and bottom than in the center region.
To the writer’s knowledge the ex- 
istence of a constancy phenomenon 
of objects so situated has neither been 
studied nor mentioned previously. 
When two objects are compared in 
the experiment on size constancy, 
there also exists the single-object con- 
stancy (of shape) within each of the 
figures. 

A related point on single-object 
velocity constancy is that not only 
does a target which is moving parallel 
with the ground change its angular 
velocity, but it also changes its angu- 
lar elevation. Angular elevation is 
low when the target is far off and 
high when it is near. The fact that 
it is seen as maintaining a level path 
could be called single-object con- 
stancy of direction. It would be of 
interest to know whether and to what 
extent a tracker is influenced by his 
tendency to perceive objects as mov- 
ing in a world of rectangular coordi- 
nates when his controls (such as 
cranks and hand-wheels) operate 
from angular inputs. 

No matter what aspect of the problem of response to target motion is
examined, it will be evident that far less has been done than remains
to be done. Circular motion has been studied in some situations but
not in others; the same is true of linear motion. There have been a
few rather special studies of harmonic motion. However, motion paths
of greater complexity and in three dimensions have attracted no
investigation. The paucity of research on responses to accelerated
motion and the absence even of discussion of higher order derivatives
of motion have already been mentioned. The psychology of response to
target motion lies in the future.


SUMMARY 


The experimental literature on responses to acceleration of target
motion was reviewed. One significant observation was that smoothly
accelerated motion is generally responded to as if the velocity were
constant. Suggestions were made of a basic approach toward obtaining
thresholds of acceleration. Examples of studies on constant velocity
motion were included in order to develop a systematic graphic method
of describing experiments on motion. The phenomenon of velocity
constancy of a single moving target was identified and generalized.


BIBLIOGRAPHY 


1. Biel, W. C., & Brown, G. E. Estimation of airplane speed and angle
of approach. Washington: Office of Naval Research, 1949. (Contract
N6ori-189. Project Number NR-143-151.)

2. Boring, E. G. A history of experimental psychology. New York:
Appleton Century, 1929.

3. Brown, J. F. The visual perception of velocity. Psychol. Forsch.,
1931, 14, 199-233.

4. Brown, J. F., & Mize, R. H. On the effect of field structure on
differential sensitivity. Psychol. Forsch., 1932, 16, 355-372.

5. Burrow, N. T. The determination of the position of a momentary
impression in the temporal course of a moving visual impression.
Psychol. Rev. Monogr. Suppl., 1909, 11, 1-63.

6. Geiger, M. Neue Complicationsversuche. Phil. Stud., 1903, 18,
347-436.

7. Gordon, D. A. The relation between the thresholds of form, motion,
and displacement in parafoveal and peripheral vision at a scotopic
level of illumination. Amer. J. Psychol., 1947, 60, 202-225.

8. Gottsdanker, R. M. The accuracy of prediction motion. J. exp.
Psychol., 1952, 43, 26-36.

9. Gottsdanker, R. M. Prediction-motion with and without vision. Amer.
J. Psychol., 1952, 65, 533-543.

10. Gottsdanker, R. M. A further study of prediction-motion. Amer. J.
Psychol., 1955, 68, 432-437.

11. Graham, C. H., Baker, K. E., Hecht, M., & Lloyd, V. V. Factors
influencing thresholds for monocular movement parallax. J. exp.
Psychol., 1948, 38, 205-223.

12. Hick, W. E. The threshold for sudden changes in the velocity of a
seen object. Quart. J. exp. Psychol., 1950, 2, 33-41.

13. Hick, W. E., & Bates, J. A. V. The human operator of control
mechanisms. London: Ministry of Supply, 1950.

14. Johansson, G. Configurations in the perception of velocity. Acta
Psychol., 1950, 7, 25-79.

15. Johansson, G. Configurations in event perception; an experimental
study. Uppsala: Almqvist & Wiksells, 1950.

16. Klemm, O. Versuche mit dem Komplicationspendel nach der Methode
der Selbsteinstellung. Psychol. Stud., 1907, 2, 324-357.

17. Metzger, W. Beobachtungen über phänomenale Identität. Psychol.
Forsch., 1934, 19, 1-60.

18. Pflaum, C. D. Neue Untersuchungen über die Zeitverhältnisse der
Apperception einfacher Sinneseindrücke am Complicationspendel. Phil.
Stud., 1900, 15, 139-148.

19. Troland, L. T. The principles of psychophysiology. New York: Van
Nostrand, 1929.

20. Von Tschisch, W. Ueber die Zeitverhältnisse der Apperception
einfacher und zusammengesetzter Vorstellungen. Phil. Stud., 1885, 2,
603-634.

21. Wundt, W. Grundzüge der physiologischen Psychologie. Leipzig:
Engelmann, 1874.

Received December 27, 1955. 





PSYCHOLOGICAL BULLETIN 
Vol. 53, No. 6, 1956





TRANSFORMED STATISTICS FOR USE IN 
TEST CONSTRUCTION 


HAROLD WEBSTER 
Mary Conover Mellon Foundation, Vassar College 


In most test construction situa- 
tions it is desirable, if not absolutely 
necessary, to select from the test 
items which are available either (a)
those which contribute most to test
reliability, or (b) those which have
the strongest relationship to an external
criterion variable, or else
(c) those items which to some extent
meet both requirements a and b. In
any case, a relatively large number
of item-test or item-criterion statis- 
tics will usually be required in order 
to identify the items which will com- 
prise the best final test, and the en- 
suing computation can be very labori- 
ous. A number of writers (1, 2, 4, 6) 
have reported on the merits of group- 
ing the test or criterion distributions 
into a relatively small number of 
symmetrical categories for the pur- 
pose of simplifying the computation 
of such item statistics. The chief 
advantage of such coarse grouping is 
the increased economy in time spent 
on computation, which at the same 
time is accompanied by a minimum 
loss of information. There appears 
to be no readily available literature 
containing formulas which are both 
economical to apply and at the same 
time utilize highly efficient grouping. 
This paper is intended partially to 
remedy this need. 

It is well known that when fre- 
quency distributions are grouped 
into broad categories, the informa- 
tion lost decreases the efficiency of 
statistics computed from such data. 
It can be shown, however, that the 
loss is less for some kinds of divisions 
into categories than it is for others. 




Flanagan (2) has shown that group- 
ing scores into symmetrically ar- 
ranged categories is relatively effici- 
ent when there are as many as five 
or seven categories. For example, 
seven categories containing, from low 
to high scores, the percentages of 
cases, 4, 8, 25, 26, 25, 8 and 4, for 
which the corresponding new scores, 
-3, -2, -1, 0, 1, 2 and 3, respectively,
have been assigned will yield
a (maximum) variance due to dif- 
ferences between categories of nearly 
95 per cent. The maximum variance 
between categories for five categories, 
if scored -2, -1, 0, 1, and 2, occurs
when they contain, respectively, 9, 
20, 42, 20 and 9 per cents of cases; it 
is about 91 per cent. Traditionally 
much item selection has been carried 
out using distributions which have 
been divided, as recommended by 
Kelley (4), into only three categories 
containing 27 per cent low, 46 per 
cent middle and 27 per cent high 
scores. In this case, the variance between
categories is only 81 per cent.
For a given number of categories,
moderate variation in percentages of
cases assigned to different categories
lowers the maximum less than might
be expected.
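
The percentages quoted above can be checked numerically. The sketch below reads "variance due to differences between categories" as the squared correlation between the integer category scores and an underlying standard normal variable; that reading is an assumption, as is the function name, though it gives values close to the 95, 91 and 81 per cent figures cited.

```python
from statistics import NormalDist

def between_category_variance(percents, scores):
    """Squared correlation between category scores and an underlying
    standard normal variable cut into the given categories.

    percents: percentages of cases per category, low to high (sum to 100).
    scores:   the new score assigned to each category, low to high.
    """
    std = NormalDist()  # mean 0, standard deviation 1
    props = [p / 100.0 for p in percents]

    # Normal density at each category boundary; 0 at -inf and +inf.
    heights = [0.0]
    cum = 0.0
    for p in props[:-1]:
        cum += p
        heights.append(std.pdf(std.inv_cdf(cum)))
    heights.append(0.0)

    # For a standard normal, the integral of z * pdf(z) over (a, b) is
    # pdf(a) - pdf(b), so the covariance of score with z is a simple sum.
    mean_score = sum(s * p for s, p in zip(scores, props))
    cov = sum(s * (heights[j] - heights[j + 1]) for j, s in enumerate(scores))
    var_score = sum(p * (s - mean_score) ** 2 for s, p in zip(scores, props))
    return cov ** 2 / var_score

seven = between_category_variance([4, 8, 25, 26, 25, 8, 4], [-3, -2, -1, 0, 1, 2, 3])
five = between_category_variance([9, 20, 42, 20, 9], [-2, -1, 0, 1, 2])
three = between_category_variance([27, 46, 27], [-1, 0, 1])
```

On this reading the efficiency falls from seven to five to three categories, which is the ordering the text describes.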

Although a large number of item- 
test or item-criterion relationships 
may be required, only a measure of 
the relative strength of such relation- 
ships is most often needed. Because 
of this fact, and because of empirical 
evidence indicating high accuracy, 
as well as high efficiency, for coarse 
grouping methods (2), it would seem 
worthwhile in most test-construction
problems to follow Flanagan's recommendations:
first to include a few
additional cases to offset loss in efficiency,
and then to apply a coarse
grouping transformation.


APPLICATIONS OF A PARTICULAR 
TRANSFORMATION


Test (or criterion) means, and 
numbers of subjects N, and items n 
are invariants under the kind of area 
transformations discussed above. In 
any problem, because of the sym- 
metrical nature of the new scores, 
the means become zero; and n and
N, being independent of the cate- 
gories, are constants of the trans- 
formation. The variance of the new 
scores is of course constant for any 
particular set of categories, even 
though it has been relatively increased 
by the coarse grouping (as well as 
absolutely reduced because of the 
smaller range of the new scores).
Similarly, correlation coefficients com- 
puted from coarsely grouped scores 
are attenuated, so that, for example, 
if required item-test or item-criterion 
relationships are the typical point- 
biserial r’s, then a correction should 
be applied. 

It is possible, however, to choose 
an efficient set of categories which at 
the same time contains proportions 
of cases such that the correction for 
coarse grouping is implicit in the 
formulas. A set possessing this com- 
putational advantage is one contain- 
ing 9, 19, 44, 19, and 9 per cents of 
cases, the new scores being, respectively,
2, 1, 0, -1 and -2. The between-categories
variance for this
transformation is 90.5 per cent, which 
is almost the maximum obtainable for 
5 categories. Formulas are given be- 
low for the more useful statistics after 
this particular transformation has 
been applied to test and criterion 
distributions. 


If the transformed test scores are 2, 1, 0, -1 and -2, then the
covariance $C_{iT}$ of the original scores of test T with item i
becomes the transformed covariance,

$$C'_{iT} = (2e + f - g - 2h)/N = D_{iT}/N. \qquad [1]$$

In [1] the frequencies of a (preferred or correct) response for item i
for papers assigned scores 2, 1, -1 and -2 are, respectively, e, f, g,
and h. Subsequently D's such as $D_{iT}$ will always refer to
differences like the one in parentheses in [1], and primes will always
indicate other transformed quantities.

Next we write the item-test point-biserial correlation,

$$r_{iT} = k\,r'_{iT}, \qquad [2]$$

where k is the correction for grouping the test scores into the broad
categories. Assuming that the original test scores T are approximately
normally distributed, the value of k can be shown (5, pp. 393-402) to
be 1.051. The standard deviation for the chosen score set, 2, 1, 0, -1
and -2, to which correspond categories containing percentages of cases
9, 19, 44, 19, and 9, is $S'_T = 1.049$. Using these two values and [1]
and [2], the item-test correlation, $r_{iT} = C_{iT}/(S_i S_T)$, is
transformed as follows,

$$r_{iT} = \frac{C_{iT}}{S_i S_T} = k\left(\frac{C'_{iT}}{S_i S'_T}\right) = \frac{1.051}{1.049}\,\frac{D_{iT}}{N S_i} \approx \frac{D_{iT}}{N S_i}. \qquad [3]$$

Statistics such as $S_i$, the standard deviation of i, which apply to
items alone, are unaffected by the transformation [3]. Setting
1.051/1.049 equal to 1.000 (instead of 1.002) in [3] introduces an
error which for the present problem is negligible.
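
As an illustration of [1] and [3], the sketch below computes the difference D and the approximate item-test correlation for a single item. The item counts are hypothetical, and the function name and the use of the usual binary-item standard deviation, the square root of p(1 - p), are assumptions made for the example.

```python
import math

def item_test_correlation(counts, group_sizes):
    """Approximate point-biserial r of one item with the test, per [1]-[3].

    counts:      number of correct responses in each of the five score
                 categories, ordered from score 2 down to score -2.
    group_sizes: number of papers in each category, same order.
    """
    e, f, _, g, h = counts                  # center category is not used in D
    n_papers = sum(group_sizes)
    d_it = 2 * e + f - g - 2 * h            # the difference D in [1]
    p = sum(counts) / n_papers              # proportion correct on the item
    s_i = math.sqrt(p * (1 - p))            # item standard deviation
    return d_it, d_it / (n_papers * s_i)    # final approximation in [3]

# Hypothetical item; 200 papers split 9, 19, 44, 19 and 9 per cent.
d, r = item_test_correlation([15, 28, 50, 14, 5], [18, 38, 88, 38, 18])
```

Only four weighted counts and the total number correct are needed for each item, which is the economy the transformation is meant to provide.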
Solving [3] for $C_{iT}$, we now obtain

$$C_{iT} = S_T D_{iT}/N. \qquad [4]$$

Replacing subscript T by subscript C in the foregoing equations gives
analogous transformed values for a criterion distribution C; for
example, expressions analogous to [3] and [4] are

$$\frac{C_{iC}}{S_i S_C} = \frac{D_{iC}}{N S_i}, \qquad [5]$$

and

$$C_{iC} = S_C D_{iC}/N, \qquad [6]$$

respectively.

From a well-known relation (3) the variance $V_T$ of a test T
containing n items may be written as the sum of the n item-test
covariances. Using this fact and summing [4] for n items,

$$V_T = \sum C_{iT} = S_T \sum D_{iT}/N. \qquad [7]$$

Dividing [7] by $S_T$, an estimate of the standard deviation of the
original test scores is obtained,

$$S_T = \sum D_{iT}/N, \qquad [8]$$

as a function of the item counts defined for use in [1]. Similarly, the
square of [8] gives an estimate of the variance of the original scores.

The validity coefficient, or correlation of test T with criterion C,
may be written

$$r_{TC} = \frac{C_{TC}}{S_T S_C} = \frac{\sum C_{iC}}{S_T S_C}. \qquad [9]$$

Substituting [8] and $\sum C_{iC}$, which is [6] summed for n items, in
[9],

$$r_{TC} = \frac{\sum D_{iC}}{\sum D_{iT}}. \qquad [10]$$

Thus [10] is the test validity estimated solely from item counts.

For item-selection purposes it is often required that the criterion
correlation of an item i significantly exceed zero before including it
in an experimental test. One way of achieving this is to include i only
if

$$C_{iC} \ge S_i S_C z/\sqrt{N}, \qquad [11]$$

where z = 1.96, or some larger value of the normal deviate
corresponding to a known level of significance. Substituting [6] in
[11],

$$D_{iC} \ge z S_i \sqrt{N}. \qquad [12]$$


Use of [12] as an item selection condition has been discussed
previously (6), where it was noted that setting $S_i = .5$, the maximum
value, provides a conservative statistical test which has the practical
effect of insuring (a) that the test will contain (statistically) valid
items, (b) that most of the selected items will have large variances.

Finally, it should be noted that [10] can also be written in another
way, namely,

$$r_{TC} = k^2 r_{T'C'} = \frac{(1.051)^2 \sum T'C'}{N (1.049)^2} \approx \frac{\sum T'C'}{N}. \qquad [13]$$

In [13] $k^2$ is the double correction for grouping both the T and C
scores into the same-sized broad categories, and $S'_T = S'_C = 1.049$.
The final approximation in [13], achieved by setting
$(1.051)^2/(1.049)^2$ equal to unity (instead of 1.004), is still close
enough for our purposes.

In some test-construction problems it may be easier to obtain the sum
of transformed cross products $\sum T'C'$ and use [13] than it would be
either to compute variances from original scores or to obtain
additional item counts for the purpose of estimating the validity by
using [10]. For example, suppose it is required to construct an
experimental "criterion-specific" test by applying [12] to a pool of
items. After papers have been grouped according to their criterion
scores C, and items for which [12] holds have been selected,¹ it is
practically always then necessary to obtain the variance, validity, and
reliability of the test comprising the selected items. These values can
immediately be approximated, without first having to tally raw test
scores or to obtain item-test relationships, by using [13] as follows.

¹ The papers should be marked at the time [12] is applied to indicate
later to which criterion distribution category they belong.

First square [10], solve for $(\sum D_{iT})^2$, and substitute the
latter in the square of [8] to obtain an expression for the original
variance,

$$V_T = \left(\sum D_{iC}\right)^2/(N r_{TC})^2. \qquad [14]$$


Substituting [14] for the variance in the Kuder-Richardson reliability
formula 20 gives

$$r_{TT} = \frac{n}{n-1}\left[1 - \frac{(N r_{TC})^2 \sum V_i}{\left(\sum D_{iC}\right)^2}\right]. \qquad [15]$$

Summation terms in [14] and [15] are obtained by summing the $D_{iC}$
for items selected by [12], and the item variances $V_i$ are obtainable
as usual from the total item counts, also available after using [12].
The validity coefficient is needed and can be estimated by [13]; once
computed it may then also be used in [14] and [15].

To obtain $\sum T'C'$ for use in [13], regroup the papers into the 5
categories, this time according to their T scores, fill in the 5 x 5
contingency table for frequencies of the T' and C' scores (center
categories may be ignored), and sum the 16 kinds of cross products.
After the papers have been reordered according to T scores, the
remaining operations take only a few minutes, even when there are a
large number of subjects.

It should be emphasized that the accuracy of the formulas is
immediately dependent upon fulfillment of the normality assumptions
concerning the original test and criterion distributions. In the case
of item statistics, departures from normality are, for reasons already
discussed, not likely to be serious; however, estimates such as [10],
[13], [14], and [15] depend upon normality assumptions for two
distributions and therefore in practice should be regarded only as
rapidly obtainable approximations to the actual values.


AN EXAMPLE 


In a large-scale research a subtest 
comprising 33 true-false personality 
inventory items was for several rea- 
sons of theoretical interest. The 
items were taken from a larger mas- 
culinity-femininity factor scale, and 
appeared to measure “fantasy, sensitivity
and esthetic interest,” and possibly
also some kind of “neurotic conflict.”
The KR-20 reliability of the
33 items was .71. 

The 33-item subtest, hereafter referred to as “X,” was scored for a
new random sample of 200 college women. Statistics for the obtained
distribution corresponding to the first four moments were
$\bar{X} = 19.555$, $V_X = 21.587$, $g_1 = -.1851$ and $g_2 = -.4366$.
Although the distribution appeared slightly flattened and negatively
skewed, test ratios for $g_1$ and $g_2$ ($-1.08\sigma$ and
$-1.28\sigma$, respectively) offered no evidence that the population
distribution was non-normal.

Papers were divided according to X scores into the five categories
recommended above, and item counts for 636 other true-false items were
obtained (see [1]). Application of [12] with z = 2.58 and $S_i = .5$
selected 89 of the new items as potential correlates of X. (Since z was
chosen to correspond to the .01 level, only about 6 would be expected
by chance alone.)
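
The selection rule used here can be sketched as follows, taking N = 200, z = 2.58 and the conservative item standard deviation of .5 from the example. The function name is hypothetical; the chance expectation simply multiplies the 636 items by the .01 level mentioned in the text.

```python
import math

def selection_threshold(n_papers, z=2.58, s_i=0.5):
    """Smallest D_iC satisfying condition [12]: D_iC >= z * S_i * sqrt(N)."""
    return z * s_i * math.sqrt(n_papers)

threshold = selection_threshold(200)     # figures from the example
min_count = math.ceil(threshold)         # D_iC is an integer difference
chance_hits = 0.01 * 636                 # items expected at the .01 level
```

Under these figures an item is retained when its weighted difference of counts is at least 19, so the rule can be applied by inspection of the tallies.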

At the same time that item counts were obtained for the new items,
counts were also obtained for the 33 items in X. Since the reliability
of X was only .71, it was not expected that its items would all
correlate well with the total score; indeed, applying condition [12]
would retain only 22 of them. One empirical check on formula [8] was
immediately available, however, using these counts; $\sum D_{iX}/N$ for
the 33 items in X was 4.645, the square of which is 21.576, a value
fairly close to 21.587, the actual sample variance of X.

For purposes of this example, the 89 new items were scored as a test T
to be correlated with X. The papers were reordered according to T
scores, and $\sum T'X'$ obtained from the contingency table for T' and
X'; this value was 154, which when used in [13] gave $r_{TX} = .770$.
However, because in a few cases different papers having the same T
scores but different X' scores could be assigned to different T'
categories in the contingency table, the estimated value of $r_{TX}$
could be made to vary between .770 and .795. The correlation obtained
using the original T and X scores without grouping was .7703. The
variance of the original scores T computed without grouping was found
to be $V_T = 197.815$. The sum of $D_{iX}$ for the 89 items was 2259;
using this value and $r_{TX} = .78$ in [14] gives $V_T = 209.693$,
which is about 6 per cent too large, but which is still close enough
for a quickly computed approximation. The sum of variances for the 89
items in T was 18.647. Using the approximate value [14] in [15] gives
.921 as an estimate of $r_{TT}$, which is close enough to the correct
value .916.
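
The arithmetic of this example can be retraced from the reported totals. The sketch below is one way to organize it; the function names are hypothetical, and only the figures quoted in the example (N = 200, 4.645, 154, 2259, .78, 89 items, 18.647) are used as inputs.

```python
def sd_from_counts(sum_d_it, n_papers):
    """[8]: standard deviation of the original scores, sum(D_iT) / N."""
    return sum_d_it / n_papers

def validity_from_cross_products(sum_tc, n_papers):
    """[13]: validity approximated by sum(T'C') / N."""
    return sum_tc / n_papers

def variance_from_validity(sum_d_ic, n_papers, r_tc):
    """[14]: original variance, (sum D_iC)^2 / (N r_TC)^2."""
    return (sum_d_ic / (n_papers * r_tc)) ** 2

def kr20(n_items, sum_item_variances, test_variance):
    """[15]: Kuder-Richardson formula 20."""
    return n_items / (n_items - 1) * (1 - sum_item_variances / test_variance)

N = 200
var_x = sd_from_counts(4.645 * N, N) ** 2       # check on [8], about 21.576
r_tx = validity_from_cross_products(154, N)     # .770
var_t = variance_from_validity(2259, N, 0.78)   # about 209.69
r_tt = kr20(89, 18.647, var_t)                  # about .921
```

Each estimate depends only on tallies already made during item selection, which is why the text describes them as rapidly obtainable approximations rather than exact values.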


CONCLUSIONS 


The transformation is relatively efficient and also appears to be
precise enough for general use. It can be applied in problems in which
item counts are available for continuously distributed test or
criterion scores. The statistics discussed are then more quickly
estimated using the transformed values than they would be using the
original scores.

Use of such a transformation leaves open to question the effects of
possible departures from normality in the original test or criterion
distributions. Judging from previous research in which test data have
been normalized, and from examples such as the one in the present
paper, these effects should not often be extreme enough to invalidate
results obtained using the transformation.


REFERENCES 


1. Adkins, Dorothy C., & Toops, H. A. Simplified formulas for item
selection and construction. Psychometrika, 1937, 2, 165-171.

2. Flanagan, J. C. The effectiveness of short methods for calculating
correlation coefficients. Psychol. Bull., 1952, 49, 342-348.

3. Gulliksen, H. Theory of mental tests. New York: Wiley, 1950.

4. Kelley, T. L. The selection of upper and lower groups for the
validation of test items. J. educ. Psychol., 1939, 30, 17-24.

5. Peters, C. C., & Van Voorhis, W. R. Statistical procedures and their
mathematical bases. New York: McGraw-Hill, 1940. Chap. XIII; Table
XLIII.

6. Webster, H. Maximizing test validity by item selection.
Psychometrika, 1956, 21, 153-164.


Received February 2, 1956. 





PSYCHOLOGICAL BULLETIN 
Vol. 53, No. 6, 1956 


ON THE ORIGIN AND EARLY USE OF THE TERM 
VICARIOUS TRIAL AND ERROR (VTE) 


KARL F. MUENZINGER 
University of Colorado 


In their recent article in this Journal on “Vicarious trial and error
and related behavior” (2), Goss and Wischner say that “to this general
pattern of behavior Muenzinger and Fletcher have given the name
‘vicarious trial-and-error,’ abbreviated ‘VTE’.” The origin of this
term should have been ascribed to Muenzinger and Gentry. In the article
referred to by Goss and Wischner I say so (4, p. 89), but it is
possible that I was not explicit enough.

It was Evelyn Gentry (now Evelyn 
G. Hooker) who made the first study 
of the phenomenon in an experiment 
designed for this purpose and not as 
a by-product of other experiments. 
Her results were described in 1930 in 
an M.A. thesis under my direction 
(1) in which the term vicarious trial 
and error with its abbreviation VTE 
were used, and in which reference was 
made to earlier descriptions of this 
kind of behavior by other experimenters.

Our criterion for recording VTE in any one trial was then and still is
“a facing into one alley before the other one, whether right or wrong,
was entered” (3, p. 77).

At first my co-workers and I 
thought that the presence of dis- 
criminanda within the choice alleys 
was the necessary condition for the 
occurrence of VTE. However, this
view almost immediately turned out
to be wrong because we observed (in 
1929) that VTE also occurred when 
the choice alleys contained no dis- 
criminanda. In this case an animal 
had to make a left or a right turn in 
conjunction with the presence or 
absence of a tone that was sounded 1 
meter above the choice point (3). 
We realized that it was the mere pres- 
ence of the choice alleys that pro- 
duced VTE. 

We have always emphasized the 
role of experimental conditions in the 
relative frequency of VTE. To illu- 
strate, our observations throughout 
the years have invariably shown that 
as compared with no shock the pres- 
ence of electric shock after the point 
of choice is accompanied by more 
VTE (3, p. 78). We have also found 
invariably that in a difficult discrimi- 
nation situation the frequency of 
VTE is higher than in an easier one 
(3, p. 81). 

We have always assumed that a 
relation between VTE and learning 
efficiency exists. This was in line 
with the notion prevalent 30 years 
ago that actual trial and error is in- 
dispensable in certain types of learn- 
ing. But we have also stated ex- 
plicitly that “we have not demon- 
strated a causal relationship between 
the two” (3, p. 84). 


REFERENCES 


1. Gentry, E. A substitution for trial and error in the white rat.
Unpublished master's thesis, Univer. of Colorado, 1930.

2. Goss, A. E., & Wischner, G. J. Vicarious trial and error and related
behavior. Psychol. Bull., 1956, 53, 35-54.

3. Muenzinger, K. F. Vicarious trial and error at a point of choice: I.
A general survey of its relation to learning efficiency. J. genet.
Psychol., 1938, 53, 75-86.

4. Muenzinger, K. F., & Fletcher, F. M. Motivation in learning: VI.
Escape from electric shock compared with hunger-food tension in the
visual discrimination habit. J. comp. Psychol., 1936, 22, 79-91.

Received March 20, 1956.


ERRATUM 


In “The Water-Jar Einstellung Test as a Measure of Rigidity,” by Eugene
E. Levitt, Vol. 53, No. 5, September, 1955, p. 368, right-hand column;

For: “1. After eight years of research, evidence for the validity of the
water-jar test as a measure of validity is still lacking.”

Read: “1. After eight years of research, evidence for the validity of the
water-jar test as a measure of rigidity is still lacking.”











